Re: [Spark Streaming] Runtime Error in call to max function for JavaPairRDD

Tathagata Das Mon, 22 Jun 2015 23:31:17 -0700

Try adding the provided scope to both Spark dependencies:

    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.4.0</version>
        <scope>provided</scope>
    </dependency>
    <dependency> <!-- Spark Streaming dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>1.4.0</version>
        <scope>provided</scope>
    </dependency>

This keeps these artifacts out of the assembly JAR, so at runtime your
application uses the Spark classes supplied by the installation rather than a
bundled copy.

See the Maven documentation on dependency scope:
https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Scope
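
If you want to confirm which copy of a class is actually being picked up, a small reflection check can help. The following is only a rough sketch, not anything that ships with Spark: the class name ClasspathProbe and the helpers originOf and hasMethod are made up for illustration. It prints where a class was loaded from and whether it exposes a public max(Comparator) method, which is the signature named in the NoSuchMethodError. It defaults to a JDK class so that it runs without Spark on the classpath.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.Comparator;

// Hypothetical diagnostic helper; the names here are illustrative only.
public class ClasspathProbe {

    /** Returns the JAR or directory a class was loaded from, or "(unknown)" if the JVM does not report one. */
    static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(unknown)";
        }
        return src.getLocation().toString();
    }

    /** Returns true if the class exposes a public method with this name and these parameter types. */
    static boolean hasMethod(Class<?> cls, String name, Class<?>... params) {
        try {
            Method m = cls.getMethod(name, params);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Default to a JDK class so the sketch runs anywhere; pass another class name to probe it.
        String className = args.length > 0 ? args[0] : "java.util.stream.Stream";
        Class<?> cls = Class.forName(className);
        System.out.println(className + " loaded from: " + originOf(cls));
        System.out.println("has max(Comparator): " + hasMethod(cls, "max", Comparator.class));
    }
}
```

Assuming this class is packaged into your application JAR and launched through spark-submit, passing org.apache.spark.api.java.JavaPairRDD as the argument should show whether that class comes from your assembly or from the Spark installation.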

On Mon, Jun 22, 2015 at 10:28 AM, Nipun Arora <[email protected]>
wrote:

> Hi Tathagata,
>
> I am attaching a snapshot of my pom.xml. It would help immensely if I could
> include max and min values in my mapper phase.
>
> The question is still open at:
> http://stackoverflow.com/questions/30902090/adding-max-and-min-in-spark-stream-in-java/30909796#30909796
>
> I see that there is a bug report filed for a similar error as well:
> https://issues.apache.org/jira/browse/SPARK-3266
>
> Please let me know how I can get the same version of Spark Streaming in
> my assembly.
> I am using the following Spark version:
> http://www.apache.org/dyn/closer.cgi/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
> with no compilation, just an untar, and I use the spark-submit script in a
> local install.
>
>
> I still get the same error.
>
> Exception in thread "JobGenerator" java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.max(Ljava/util/Comparator;)Lscala/Tuple2;
>
> <dependencies>
>     <dependency> <!-- Spark dependency -->
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-core_2.10</artifactId>
>         <version>1.4.0</version>
>     </dependency>
>     <dependency> <!-- Spark Streaming dependency -->
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-streaming_2.10</artifactId>
>         <version>1.4.0</version>
>     </dependency>
>
> Thanks
>
> Nipun
>
>
> On Thu, Jun 18, 2015 at 11:16 PM, Nipun Arora <[email protected]>
> wrote:
>
>> Hi Tathagata,
>>
>> When you say to mark spark-core and spark-streaming as dependencies,
>> what do you mean?
>> I have installed the pre-built spark-1.4 for Hadoop 2.6 from the Spark
>> downloads. In my Maven pom.xml, I am using version 1.4 as described.
>>
>> Please let me know how I can fix that.
>>
>> Thanks
>> Nipun
>>
>> On Thu, Jun 18, 2015 at 4:22 PM, Tathagata Das <[email protected]>
>> wrote:
>>
>>> I think you may be including a different version of Spark Streaming in
>>> your assembly. Please mark spark-core and spark-streaming as provided
>>> dependencies. Any installation of Spark will automatically provide Spark on
>>> the classpath, so you do not have to bundle it.
>>>
>>> On Thu, Jun 18, 2015 at 8:44 AM, Nipun Arora <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have the following piece of code, where I am trying to transform a
>>>> Spark stream and add the min and max of each RDD to it. However, at run
>>>> time I get an error saying the max call does not exist (it compiles
>>>> properly). I am using spark-1.4.
>>>>
>>>> I have added the question to stackoverflow as well:
>>>> http://stackoverflow.com/questions/30902090/adding-max-and-min-in-spark-stream-in-java/30909796#30909796
>>>>
>>>> Any help is greatly appreciated :)
>>>>
>>>> Thanks
>>>> Nipun
>>>>
>>>> JavaPairDStream<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> sortedtsStream =
>>>>         transformedMaxMintsStream.transformToPair(new Sort2());
>>>>
>>>> sortedtsStream.foreach(
>>>>         new Function<JavaPairRDD<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>>, Void>() {
>>>>             @Override
>>>>             public Void call(JavaPairRDD<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> tuple2Tuple3JavaPairRDD) throws Exception {
>>>>                 List<Tuple2<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>>> templist =
>>>>                         tuple2Tuple3JavaPairRDD.collect();
>>>>                 for (Tuple2<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> tuple : templist) {
>>>>
>>>>                     Date date = new Date(tuple._1._1);
>>>>                     int pattern = tuple._1._2;
>>>>                     int count = tuple._2._1();
>>>>                     Date maxDate = new Date(tuple._2._2());
>>>>                     // Assumes the Tuple3 holds (count, max, min), so min is the third field.
>>>>                     Date minDate = new Date(tuple._2._3());
>>>>                     System.out.println("TimeSlot: " + date.toString() + " Pattern: " + pattern
>>>>                             + " Count: " + count + " Max: " + maxDate.toString()
>>>>                             + " Min: " + minDate.toString());
>>>>
>>>>                 }
>>>>                 return null;
>>>>             }
>>>>         }
>>>> );
>>>>
>>>> Error:
>>>>
>>>>
>>>> 15/06/18 11:05:06 INFO BlockManagerInfo: Added input-0-1434639906000 in memory on localhost:42829 (size: 464.0 KB, free: 264.9 MB)
>>>> 15/06/18 11:05:06 INFO BlockGenerator: Pushed block input-0-1434639906000
>>>> Exception in thread "JobGenerator" java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.max(Ljava/util/Comparator;)Lscala/Tuple2;
>>>>         at org.necla.ngla.spark_streaming.MinMax.call(Type4ViolationChecker.java:346)
>>>>         at org.necla.ngla.spark_streaming.MinMax.call(Type4ViolationChecker.java:340)
>>>>         at org.apache.spark.streaming.api.java.JavaDStreamLike$class.scalaTransform$3(JavaDStreamLike.scala:360)
>>>>         at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$transformToPair$1.apply(JavaDStreamLike.scala:361)
>>>>         at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$transformToPair$1.apply(JavaDStreamLike.scala:361)
>>>>         at org.apache.spark.streaming.dstream.DStream$$anonfun$transform$1$$anonf
>>>>
>>>>
>>>
>>
>
