[ 
https://issues.apache.org/jira/browse/BEAM-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952999#comment-15952999
 ] 

Wesley Tanaka edited comment on BEAM-1859 at 4/3/17 5:06 AM:
-------------------------------------------------------------

Explicitly adding org.apache.hadoop:hadoop-core:0.20.2 as a dependency does
resolve the issue, thanks. I'll just do that; I didn't know that the best
practice was to assume Hadoop was already installed.
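
For anyone hitting the same NoClassDefFoundError, a quick way to confirm
whether the Hadoop class named in the stack trace is visible to the JVM is a
small classpath probe. This is only a minimal sketch: the class name
HadoopClasspathCheck and the helper isOnClasspath are made up for
illustration and are not part of Beam or Hadoop.

```java
public class HadoopClasspathCheck {

    /**
     * Returns true if the named class can be loaded by this class's
     * class loader. The class is not initialized, only located.
     * (Hypothetical helper, not part of Beam or Hadoop.)
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false,
                    HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class the sorter extension failed to load in the trace below.
        String hadoopConf = "org.apache.hadoop.conf.Configuration";
        if (isOnClasspath(hadoopConf)) {
            System.out.println(hadoopConf + " is on the classpath");
        } else {
            System.out.println(hadoopConf
                    + " is missing; a Hadoop dependency such as"
                    + " org.apache.hadoop:hadoop-core:0.20.2 is needed");
        }
    }
}
```

Running this with the same classpath as the pipeline should report the class
as missing before the dependency is added and as present afterwards.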

In case it helps to know it, my use case is one of learning the Beam API, not 
of trying to actually accomplish something with it:

* I am trying to learn the Beam API
* So I am trying to create different toy composite PTransforms
* and I'd like to speed up the code/test/debug cycle relative to uploading code 
into a cluster
* so, despite this being nonsensical w.r.t. the actual use of Beam, I am trying 
to hack together some code to get DirectRunner to read lines from stdin and 
write lines to stdout and run the same code against different inputs to see how 
it behaves.

In case it's also interesting to know: in my actual use case, I don't have
Hadoop set up. I'm using Beam with only Flink and Kafka at the moment.


> sorter extension depends on hadoop but does not declare as such in repository 
> artifact
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-1859
>                 URL: https://issues.apache.org/jira/browse/BEAM-1859
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 0.6.0
>            Reporter: Wesley Tanaka
>            Assignee: Davor Bonaci
>             Fix For: Not applicable
>
>
> When SortValues is used via 
> {{org.apache.beam:beam-sdks-java-extensions-sorter:0.6.0}}, this exception is 
> raised:
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
>       at org.apache.beam.sdk.extensions.sorter.BufferedExternalSorter.create(BufferedExternalSorter.java:98)
>       at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn.processElement(SortValues.java:153)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>       at org.apache.beam.sdk.extensions.sorter.BufferedExternalSorter.create(BufferedExternalSorter.java:98)
>       at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn.processElement(SortValues.java:153)
>       at org.apache.beam.sdk.extensions.sorter.SortValues$SortValuesDoFn$auxiliary$uK25yOmK.invokeProcessElement(Unknown Source)
>       at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:198)
>       at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:159)
>       at org.apache.beam.runners.core.PushbackSideInputDoFnRunner.processElement(PushbackSideInputDoFnRunner.java:111)
>       at org.apache.beam.runners.core.PushbackSideInputDoFnRunner.processElementInReadyWindows(PushbackSideInputDoFnRunner.java:77)
>       at org.apache.beam.runners.direct.ParDoEvaluator.processElement(ParDoEvaluator.java:134)
>       at org.apache.beam.runners.direct.DoFnLifecycleManagerRemovingTransformEvaluator.processElement(DoFnLifecycleManagerRemovingTransformEvaluator.java:51)
>       at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
>       at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I think the issue is that beam-sdks-java-extensions-sorter should declare 
> that it depends on that hadoop library but does not?


