any support to use Spark UDF in HIVE

2017-05-04 Thread Manohar753
Hi,

I have seen many Hive UDFs being used in Spark SQL, so is there any way to do
the reverse? I want to write some UDF code for Spark and have the same code
usable in Hive.
Please suggest all possible approaches in Spark with Java.
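One common approach (a sketch, not the only option) is to implement the function
against the Hive UDF API; Hive can use the jar directly, and Spark SQL with Hive
support can register the same class, so one implementation serves both engines.
Class and function names below are illustrative:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Sketch only: a plain Hive UDF written in Java. In Hive:
//   ADD JAR my-udfs.jar;
//   CREATE TEMPORARY FUNCTION upper_trim AS 'com.example.UpperTrimUDF';
// In Spark SQL (with Hive support enabled) the same statements can be issued
// through spark.sql(...), so one jar serves both engines.
public class UpperTrimUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}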

Thanks in advance.

Regards,
Manoh



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/any-support-to-use-Spark-UDF-in-HIVE-tp28649.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



JavaRDD collectAsMap() throws java.lang.NegativeArraySizeException

2017-04-27 Thread Manohar753


Hi All,
I am getting the exception below while converting my RDD to a Map. My data is
hardly a 200MB Snappy file, and the code looks like this:

@SuppressWarnings("unchecked")
public Tuple2<Map<String, byte[]>, String> getMatchData(String location, String key) {
    ParquetInputFormat.setReadSupportClass(this.getJob(),
            (Class<ReadSupport<GenericRecord>>) (Class<?>) AvroReadSupport.class);
    JavaPairRDD<Void, GenericRecord> avroRDD =
            this.getSparkContext().newAPIHadoopFile(location,
                    (Class<ParquetInputFormat<GenericRecord>>) (Class<?>) ParquetInputFormat.class,
                    Void.class, GenericRecord.class,
                    this.getJob().getConfiguration());
    JavaPairRDD<String, GenericRecord> kv =
            avroRDD.mapToPair(new MapAvroToKV(key, new SpelExpressionParser()));
    Schema schema = kv.first()._2().getSchema();
    JavaPairRDD<String, byte[]> bytesKV = kv.mapToPair(new AvroToBytesFunction());
    Map<String, byte[]> map = bytesKV.collectAsMap();
    Map<String, byte[]> hashmap = new HashMap<>(map);
    return new Tuple2<>(hashmap, schema.toString());
}


Please help me out if you have any thoughts, and thanks in advance.
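For what it is worth, the NegativeArraySizeException below is thrown inside Kryo's
readBytes while a broadcast block is deserialized. One commonly suggested
mitigation, offered here as an assumption rather than a confirmed fix for this job,
is to raise the Kryo buffer limits in the SparkConf:

import org.apache.spark.SparkConf;

SparkConf conf = new SparkConf()
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryoserializer.buffer", "64m")       // illustrative value
        .set("spark.kryoserializer.buffer.max", "512m"); // illustrative value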



04/27 03:13:18 INFO YarnClusterScheduler: Cancelling stage 11
04/27 03:13:18 INFO DAGScheduler: ResultStage 11 (collectAsMap at
DoubleClickSORJob.java:281) failed in 0.734 s due to Job aborted due to
stage failure: Task 0 in stage 11.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 11.0 (TID 15, ip-172-31-50-58.ec2.internal, executor
1): java.io.IOException: java.lang.NegativeArraySizeException
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)
at
org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at
org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at
org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at
org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NegativeArraySizeException
at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:325)
at
com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:60)
at
com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:43)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at
org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:244)
at
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$10.apply(TorrentBroadcast.scala:286)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303)
at
org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:287)
at
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:221)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1269)
... 11 more

Driver stacktrace:
04/27 03:13:18 INFO DAGScheduler: Job 11 failed: collectAsMap at
DoubleClickSORJob.java:281, took 3.651447 s
04/27 03:13:18 ERROR ApplicationMaster: User class threw exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 11.0 failed 4 times, most recent failure: Lost task 0.3 in stage 11.0
(TID 15, ip-172-31-50-58.ec2.internal, executor 1): java.io.IOException:
java.lang.NegativeArraySizeException
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)
at
org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at
org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at
org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at
org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   

ClassCastException while reading from GS and writing to S3. I feel it happens while writing to S3.

2017-02-18 Thread Manohar753
Hi All,
I am able to run my simple Spark job (read from GS and write to S3) locally, but
when I move to the cluster I get the cast exception below. The Spark environment
is 2.0.1.
Please help out if anyone has faced this kind of issue already.
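This particular ClassCastException during task deserialization is often reported
when the application jar is built against a different Spark version than the one
the cluster runs; that is an assumption here, not a confirmed diagnosis. A quick
diagnostic sketch, assuming a SparkSession named spark:

// Compare the runtime version with the spark-core/spark-sql version in the build file.
System.out.println("Runtime Spark version: " + spark.version());
System.out.println("spark-core comes from: " +
        org.apache.spark.SparkContext.class.getProtectionDomain()
                .getCodeSource().getLocation());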



02/18 10:35:23 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
ip-172-31-45-63.ec2.internal): java.io.IOException:
java.lang.ClassCastException: cannot assign instance of scala.Some to field
org.apache.spark.util.AccumulatorMetadata.name of type scala.Option in
instance of org.apache.spark.util.AccumulatorMetadata
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1283)
at 
org.apache.spark.util.AccumulatorV2.readObject(AccumulatorV2.scala:171)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: cannot assign instance of
scala.Some to field org.apache.spark.util.AccumulatorMetadata.name of type
scala.Option in instance of org.apache.spark.util.AccumulatorMetadata
at
java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
at 
java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2237)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Class-Cast-Exception-while-read-from-GS-and-write-to-S3-I-feel-gettng-while-writeing-to-s3-tp28403.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



JavaRDD text metadata (file name) findings

2017-01-31 Thread Manohar753
Hi All,
My Spark job is reading data from a folder containing different files with the
same structured data.
The resulting JavaRDD is processed line by line, but is there any way to know
which file each line of data came from?
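A sketch of one common approach, assuming a JavaSparkContext named sc and a
hypothetical folder path: wholeTextFiles returns (fileName, fileContent) pairs, so
the originating file travels with the data. For line-level processing of large
files, the Dataset API's input_file_name() function is another option.

import org.apache.spark.api.java.JavaPairRDD;

JavaPairRDD<String, String> filesWithContent =
        sc.wholeTextFiles("hdfs:///path/to/folder");   // hypothetical path
filesWithContent.foreach(pair ->
        System.out.println("file=" + pair._1() + ", length=" + pair._2().length()));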
Team, thank you in advance for your reply.

Thanks,



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/JavaRDD-text-matadata-file-name-findings-tp28353.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Will be in around 12:30pm due to some personal stuff

2017-01-19 Thread Manohar753
Get Outlook for Android








--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Will-be-in-around-12-30pm-due-to-some-personal-stuff-tp28326.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Spark Read from Google store and save in AWS s3

2017-01-05 Thread Manohar753
Hi All,

Using Spark, is interoperability between two clouds (Google, AWS) possible?
In my use case I need to take Google Cloud Storage as input to Spark, do some
processing, and finally store the result in S3; my Spark engine runs on an AWS
cluster.

Please let me know whether there is any way to handle this kind of use case
directly with Spark, without any middle components, and share any info or link if
you have one.
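For what it is worth, this can be done directly from Spark when the Hadoop
filesystem connectors for both clouds are on the classpath and configured; no
middle component is needed. A minimal sketch, assuming a JavaSparkContext named
sc, the GCS connector and S3A libraries available, credentials already configured,
and hypothetical bucket names:

import org.apache.spark.api.java.JavaRDD;

JavaRDD<String> input = sc.textFile("gs://my-gcs-bucket/input/");   // read from Google Cloud Storage
JavaRDD<String> processed = input.filter(line -> !line.isEmpty());  // some processing
processed.saveAsTextFile("s3a://my-s3-bucket/output/");             // write to S3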

Thanks,




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Read-from-Google-store-and-save-in-AWS-s3-tp28278.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Spark java with Google Store

2017-01-05 Thread Manohar753
Hi Team,

Can someone please share any examples of reading and writing files from
Google Cloud Storage with Spark in Java?
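A minimal configuration sketch, assuming the gcs-connector jar is on the driver
and executor classpath, a JavaSparkContext named sc, and a hypothetical
service-account keyfile path; the property names are the ones documented for the
Google Cloud Storage Hadoop connector:

import org.apache.hadoop.conf.Configuration;

Configuration hadoopConf = sc.hadoopConfiguration();
hadoopConf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
hadoopConf.set("fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS");
hadoopConf.set("google.cloud.auth.service.account.enable", "true");
hadoopConf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/keyfile.json"); // hypothetical

// After this, gs:// paths behave like any other Hadoop path:
sc.textFile("gs://my-gcs-bucket/input/").saveAsTextFile("gs://my-gcs-bucket/output/");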

Thank you in advance.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-java-with-Google-Store-tp28276.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Spark version upgrade issue: Exception in thread "main" java.lang.NoSuchMethodError

2015-08-28 Thread Manohar753
Hi Team,
I upgraded from an older Spark version to 1.4.1. After the Maven build I tried to
run my simple application, but it failed with the stack trace below.

Exception in thread "main" java.lang.NoSuchMethodError:
com.fasterxml.jackson.databind.introspect.POJOPropertyBuilder.addField(Lcom/fasterxml/jackson/databind/introspect/AnnotatedField;Lcom/fasterxml/jackson/databind/PropertyName;ZZZ)V
at
com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector.com$fasterxml$jackson$module$scala$introspect$ScalaPropertiesCollector$$_addField(ScalaPropertiesCollector.scala:109)

I checked all the fasterxml Jackson versions but had no luck.
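A quick diagnostic sketch, not a fix: print which jar supplies jackson-databind at
runtime. This NoSuchMethodError typically means an older jackson-databind wins on
the classpath than the one jackson-module-scala (pulled in by Spark 1.4.1) was
compiled against, though that is an assumption for this particular build.

System.out.println(com.fasterxml.jackson.databind.ObjectMapper.class
        .getProtectionDomain().getCodeSource().getLocation());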



Any help on this would be appreciated if somebody has already faced this issue.

Thanks in advance.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Version-upgrade-isue-Exception-in-thread-main-java-lang-NoSuchMethodError-tp24488.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



java.lang.ArrayIndexOutOfBoundsException: 0 on Yarn Client

2015-07-27 Thread Manohar753
Hi Team,

Can somebody please help me figure out what I am doing wrong to get the exception
below while running my app on a YARN cluster with Spark 1.4?

I am consuming a Kafka stream, calling foreachRDD on it, and handing each RDD to
a new thread for processing. Please find the code snippet below.

JavaDStream<MessageAndMetadata> unionStreams = ReceiverLauncher.launch(
        jsc, props, numberOfReceivers, StorageLevel.MEMORY_ONLY());
unionStreams.foreachRDD(new Function2<JavaRDD<MessageAndMetadata>, Time, Void>() {

    @Override
    public Void call(JavaRDD<MessageAndMetadata> rdd, Time time) throws Exception {
        // Hand the batch off to a separate thread for processing.
        new ThreadParam(rdd).start();
        return null;
    }
});
#
public ThreadParam(JavaRDD<MessageAndMetadata> rdd) {
    this.rdd = rdd;
    // this.context = context;
}

public void run() {
    final List<StructField> fields = new ArrayList<StructField>();
    List<String> listvalues = new ArrayList<>();
    final List<String> meta = new ArrayList<>();

    JavaRDD<Row> rowrdd = rdd.map(new Function<MessageAndMetadata, Row>() {
        @Override
        public Row call(MessageAndMetadata arg0) throws Exception {
            String[] data = new String(arg0.getPayload()).split("\\|");
            int i = 0;
            // Note: these lists are local to the map function and live on the
            // executors; the outer 'fields' list declared above, which is used to
            // build the schema on the driver, is never populated.
            List<StructField> fields = new ArrayList<StructField>();
            List<String> listvalues = new ArrayList<>();
            List<String> meta = new ArrayList<>();
            for (String string : data) {
                if (i > 3) {
                    if (i % 2 == 0) {
                        fields.add(DataTypes.createStructField(string, DataTypes.StringType, true));
                        // System.out.println(splitarr[i]);
                    } else {
                        listvalues.add(string);
                        // System.out.println(splitarr[i]);
                    }
                } else {
                    meta.add(string);
                }
                i++;
            }
            int size = listvalues.size();
            return RowFactory.create(
                    listvalues.get(25 - 25), listvalues.get(25 - 24), listvalues.get(25 - 23),
                    listvalues.get(25 - 22), listvalues.get(25 - 21), listvalues.get(25 - 20),
                    listvalues.get(25 - 19), listvalues.get(25 - 18), listvalues.get(25 - 17),
                    listvalues.get(25 - 16), listvalues.get(25 - 15), listvalues.get(25 - 14),
                    listvalues.get(25 - 13), listvalues.get(25 - 12), listvalues.get(25 - 11),
                    listvalues.get(25 - 10), listvalues.get(25 - 9), listvalues.get(25 - 8),
                    listvalues.get(25 - 7), listvalues.get(25 - 6), listvalues.get(25 - 5),
                    listvalues.get(25 - 4), listvalues.get(25 - 3), listvalues.get(25 - 2),
                    listvalues.get(25 - 1));
        }
    });

    SQLContext sqlContext = new SQLContext(rowrdd.context());
    StructType schema = DataTypes.createStructType(fields);
    System.out.println("before creating schema");
    DataFrame courseDf = sqlContext.createDataFrame(rowrdd, schema);
    courseDf.registerTempTable("course");
    courseDf.show();
    System.out.println("after creating schema");
}
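One observation, offered as a plausible cause rather than a confirmed diagnosis:
the schema passed to createDataFrame is built from the outer fields list, which is
only ever filled inside the map function on the executors, so the StructType on
the driver stays empty while each Row carries 25 values; that mismatch is one
plausible source of the ArrayIndexOutOfBoundsException further down. A minimal
sketch of building the schema on the driver first, assuming every record shares
the pipe-delimited layout used above:

// Derive the field names once on the driver from the first record, then reuse the schema.
List<StructField> driverFields = new ArrayList<>();
String[] first = new String(rdd.first().getPayload()).split("\\|");
for (int i = 4; i < first.length; i += 2) {   // entries after index 3 alternate name/value
    driverFields.add(DataTypes.createStructField(first[i], DataTypes.StringType, true));
}
StructType driverSchema = DataTypes.createStructType(driverFields);
DataFrame courseDf = sqlContext.createDataFrame(rowrdd, driverSchema);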


Below is the command used to run this, and after that is the stack trace error:
 MASTER=yarn-client /home/hadoop/spark/bin/spark-submit --class
com.person.Consumer
/mnt1/manohar/spark-load-from-db/targetpark-load-from-db-1.0-SNAPSHOT-jar-with-dependencies.jar


The error is as follows:


15/07/27 14:45:01 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0
(TID 72, ip-10-252-7-73.us-west-2.compute.internal):
java.lang.ArrayIndexOutOfBoundsException: 0
at
org.apache.spark.sql.catalyst.CatalystTypeConverters$.convertRowWithConverters(CatalystTypeConverters.scala:348)
at
org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$4.apply(CatalystTypeConverters.scala:180)
at
org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
at
org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$10.next(Iter

DataFrame InsertIntoJdbc() Runtime Exception on cluster

2015-07-15 Thread Manohar753
Hi All,

I am trying to add a few new rows to an existing table in MySQL using a DataFrame.
It adds the new rows to the table in my local environment, but on the Spark
cluster I get the runtime exception below.


Exception in thread "main" java.lang.RuntimeException: Table msusers_1
already exists.
at scala.sys.package$.error(package.scala:27)
at
org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:240)
at
org.apache.spark.sql.DataFrame.insertIntoJDBC(DataFrame.scala:1481)
at com.sparkexpert.UserMigration.main(UserMigration.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/07/15 08:13:42 INFO spark.SparkContext: Invoking stop() from shutdown
hook
15/07/15 08:13:42 INFO handler.ContextHandler: stopped
o.s.j.s.ServletContextHandler{/metrics/json,null}
15/07/15 08:13:

code snippet is below:

System.out.println(Query);
Map<String, String> options = new HashMap<>();
options.put("driver", PropertyLoader.getProperty(Constants.msSqlDriver));
options.put("url", PropertyLoader.getProperty(Constants.msSqlURL));
options.put("dbtable", Query);
options.put("numPartitions", "1");
DataFrame delatUsers = sqlContext.load("jdbc", options);

delatUsers.show();

// Load latest users DataFrame
String mysQuery = "(SELECT * FROM msusers_1) as employees_name";
Map<String, String> msoptions = new HashMap<>();
msoptions.put("driver", PropertyLoader.getProperty(Constants.mysqlDriver));
msoptions.put("url", PropertyLoader.getProperty(Constants.mysqlUrl));
msoptions.put("dbtable", mysQuery);
msoptions.put("numPartitions", "1");
DataFrame latestUsers = sqlContext.load("jdbc", msoptions);

// Get updated users data
DataFrame updatedUsers = delatUsers.as("ms").join(latestUsers.as("lat"),
        col("lat.uid").equalTo(col("ms.uid")), "inner")
        .select("ms.revision", "ms.uid", "ms.UserType", "ms.FirstName", "ms.LastName",
                "ms.Email", "ms.smsuser_id", "ms.dev_acct", "ms.lastlogin", "ms.username",
                "ms.schoolAffiliation", "ms.authsystem_id", "ms.AdminStatus");

// Insert new users into the MySQL DB
delatUsers.except(updatedUsers).insertIntoJDBC(PropertyLoader.getProperty(Constants.mysqlUrl),
        "msusers_1", false);
The insertIntoJDBC call above is the line where the exception occurs.
Team, please give me some inputs if anyone has come across this.
With overwrite set to true, writing the same table works fine on the cluster as well.
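A sketch of an alternative worth trying, assuming the Spark 1.4 DataFrameWriter
API and the same PropertyLoader/Constants helpers and JDBC driver setup as above:
append the delta through write().mode(SaveMode.Append).jdbc(...), which inserts
into an existing table instead of taking the "already exists" error path that
insertIntoJDBC(url, table, false) hits in the stack trace.

import java.util.Properties;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SaveMode;

// Assumes the MySQL URL carries the credentials and the driver jar is on the classpath.
DataFrame newUsers = delatUsers.except(updatedUsers);
newUsers.write().mode(SaveMode.Append)
        .jdbc(PropertyLoader.getProperty(Constants.mysqlUrl), "msusers_1", new Properties());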

Thanks,
manoar



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/DataFrame-InsertIntoJdbc-Runtime-Exception-on-cluster-tp23851.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: lower and upper offset not working in spark with mysql database

2015-07-05 Thread Manohar753
I think you should specify partitionColumn as below, and the column type should be
numeric. It works in my case.

options.put("partitionColumn", "revision");
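A fuller sketch of the JDBC partitioning options (values are illustrative): note
that lowerBound and upperBound only control how the partitionColumn range is split
across numPartitions; they do not filter rows. To read only records 100 to 110,
put the filter into the dbtable query itself.

options.put("partitionColumn", "revision");
options.put("lowerBound", "100");
options.put("upperBound", "110");
options.put("numPartitions", "2");
options.put("dbtable", "(SELECT * FROM tempTable WHERE revision BETWEEN 100 AND 110) AS t");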


Thanks,
Manohar


From: Hafiz Mujadid [via Apache Spark User List] 
[mailto:ml-node+s1001560n23635...@n3.nabble.com]
Sent: Monday, July 6, 2015 10:56 AM
To: Manohar Reddy
Subject: lower and upper offset not working in spark with mysql database

Hi all!

I am trying to read records from offset 100 to 110 from a table using the following
piece of code:

val sc = new SparkContext(new 
SparkConf().setAppName("SparkJdbcDs").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)
val options = new HashMap[String, String]()
options.put("driver", "com.mysql.jdbc.Driver")
options.put("url", "jdbc:mysql://***:3306/temp?user=&password=")
options.put("dbtable", "tempTable")
options.put("lowerBound", "100")
options.put("upperBound", "110")
options.put("numPartitions", "1")
sqlContext.load("jdbc", options)


but this returns all the records instead of only 10 records









--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/lower-and-upper-offset-not-working-in-spark-with-mysql-database-tp23635p23637.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

JDBCRDD sync with mssql

2015-06-25 Thread Manohar753
Hi Team,

In my use case I need to keep Spark in sync with MSSQL for any operation that
happens in MSSQL. As far as I know, Spark has JdbcRDD, which reads data from
RDBMS tables between upper and lower limits.
Can someone please tell me whether there is any API to sync data automatically
from a single RDBMS table for any DML that happens on the table?

If there is no existing configuration/API for this use case, I can store the upper
and lower limits of the table from the last read and use them for the next read.

Please suggest if anybody has inputs on this; that would be a great help.
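A sketch of the "store the last bound and resume from it" idea, under these
assumptions: a numeric, monotonically increasing id column, a JavaSparkContext
named sc, and hypothetical MSSQL connection details. Spark itself has no
change-data-capture API for RDBMS tables; it only re-reads the ranges it is asked
for.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.rdd.JdbcRDD;

long lastReadId = 1000L;   // loaded from wherever the previous run stored it
long newMaxId = 2000L;     // current MAX(id), looked up before this run

JavaRDD<String> delta = JdbcRDD.create(
        sc,
        new JdbcRDD.ConnectionFactory() {
            @Override
            public Connection getConnection() throws Exception {
                return DriverManager.getConnection(
                        "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "pass");
            }
        },
        "SELECT name FROM mytable WHERE id >= ? AND id <= ?",
        lastReadId + 1, newMaxId, 4,
        new Function<ResultSet, String>() {
            @Override
            public String call(ResultSet rs) throws Exception {
                return rs.getString(1);
            }
        });
// After processing, persist newMaxId so the next run starts from it.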

Thank You In Advance





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/JDBCRDD-sync-with-mssql-tp23487.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



JavaDStream read and write rdbms

2015-06-22 Thread Manohar753

Hi Team,

How can I split the JavaDStream I have read and write it into MySQL in Java?

Is there any existing API for this in Spark 1.3/1.4?
Team, can you please share a code snippet if anybody has one?
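A sketch of the usual pattern, assuming a JavaDStream<String> named lines holding
pipe-delimited records and hypothetical MySQL connection details: foreachRDD plus
foreachPartition, opening one connection per partition and batching the inserts.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

lines.foreachRDD(rdd -> {
    rdd.foreachPartition(records -> {
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://host:3306/mydb", "user", "pass");
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO mytable (col1, col2) VALUES (?, ?)");
        while (records.hasNext()) {
            String[] parts = records.next().split("\\|");
            ps.setString(1, parts[0]);
            ps.setString(2, parts[1]);
            ps.addBatch();
        }
        ps.executeBatch();
        conn.close();
    });
    return null; // foreachRDD in Spark 1.3/1.4 expects a Function<..., Void>
});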

Thanks,
Manohar




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/JavaDStream-String-read-and-write-rdbms-tp23423.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



N kafka topics vs N spark Streaming

2015-06-19 Thread Manohar753
Hi Everybody,

I have four Kafka topics, one for each separate operation (Add, Delete, Update,
Merge), so Spark will also have four consumed streams. How can I run my Spark job
here?

Should I run four Spark jobs separately?
Is there any way to bundle all the streams into a single jar and run them as a
single job?
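One single-application layout is possible: with the Kafka direct API available in
Spark 1.x (spark-streaming-kafka for Kafka 0.8), a single stream can subscribe to
all four topics, so one job and one jar can cover Add, Delete, Update and Merge.
A sketch, assuming a JavaStreamingContext named jssc and hypothetical broker and
topic names:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import kafka.serializer.StringDecoder;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

Map<String, String> kafkaParams = new HashMap<>();
kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");

Set<String> topics = new HashSet<>(Arrays.asList("add", "delete", "update", "merge"));

JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
        jssc, String.class, String.class,
        StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);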

Thanks in Advance.





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/N-kafka-topics-vs-N-spark-Streaming-tp23408.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



how to maintain the offset for spark streaming if HDFS is the source

2015-06-16 Thread Manohar753
Hi All,
In my use case an HDFS file is the source for a Spark stream.
The job processes the data line by line, but how can I make sure to maintain the
offset (the line number of data already processed) across a restart or a new code
push?

Team, can you please reply on whether there is any configuration in Spark for this?
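For what it is worth, Spark's file stream tracks progress per file rather than per
line: it processes files that appear in the monitored directory after the stream
starts, and checkpointing lets a restarted driver pick up where the previous run
left off (with the documented caveat that checkpoint data generally cannot be
reused after the application code changes). A sketch, assuming the Spark 1.4+
getOrCreate signature and hypothetical paths:

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

JavaStreamingContext jssc = JavaStreamingContext.getOrCreate(
        "hdfs:///checkpoints/my-app",
        () -> {
            JavaStreamingContext ctx = new JavaStreamingContext(
                    new SparkConf().setAppName("hdfs-stream"), Durations.seconds(30));
            ctx.checkpoint("hdfs:///checkpoints/my-app");
            JavaDStream<String> lines = ctx.textFileStream("hdfs:///incoming/");
            lines.print();
            return ctx;
        });
jssc.start();
jssc.awaitTermination();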


Thanks.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/how-to-maintain-the-offset-for-spark-streaming-if-HDFS-is-the-source-tp23336.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org