Re: How to set KryoRegistrator class in spark-shell
Another option would be to close sc and open a new context with your custom configuration.

On Jun 11, 2015 01:17, bhomass bhom...@gmail.com wrote:
> you need to register using spark-defaults.conf as explained here [...]
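For concreteness, restarting the context from inside the shell might look like the sketch below. The package/class name foo.bar.MyKryoRegistrator is a placeholder; note that the registrator class must already be on the executor classpath (e.g. a jar passed at launch), since Kryo looks it up by name on the workers:

```scala
// Inside spark-shell: stop the context the shell created,
// then build a replacement with Kryo serialization configured.
import org.apache.spark.{SparkConf, SparkContext}

sc.stop()

val conf = new SparkConf()
  .setAppName("kryo-prototyping")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "foo.bar.MyKryoRegistrator") // placeholder class name

// Shadows the shell's original sc binding for the rest of the session.
val sc = new SparkContext(conf)
```

This only affects jobs run on the new context; settings on an already-running context are not re-read.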
Re: How to set KryoRegistrator class in spark-shell
Or launch spark-shell with --conf spark.kryo.registrator=foo.bar.MyClass

2015-06-11 14:30 GMT+02:00 Igor Berman igor.ber...@gmail.com:
> Another option would be to close sc and open a new context with your custom configuration.
Re: How to set KryoRegistrator class in spark-shell
you need to register using spark-defaults.conf as explained here:
https://books.google.com/books?id=WE_GBwAAQBAJ&pg=PA239&lpg=PA239&dq=spark+shell+register+kryo+serialization&source=bl&ots=vCxgEfz1-2&sig=dHU8FY81zVoBqYIJbCFuRwyFjAw&hl=en&sa=X&ved=0CEwQ6AEwB2oVChMIn_iujpCGxgIVDZmICh3kYADW#v=onepage&q=spark%20shell%20register%20kryo%20serialization&f=false

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-KryoRegistrator-class-in-spark-shell-tp12498p23265.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
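For reference, the corresponding entries in conf/spark-defaults.conf are plain whitespace-separated key/value properties like the sketch below. The class name yourpackage.MyKryoRegistrator is a placeholder, and the jar containing it still has to be on the driver and executor classpath:

```
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator  yourpackage.MyKryoRegistrator
```

These defaults apply to every spark-shell and spark-submit invocation on that machine unless overridden on the command line.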
How to set KryoRegistrator class in spark-shell
I want to use opencsv's CSVParser to parse csv lines using a script like below in spark-shell:

import au.com.bytecode.opencsv.CSVParser
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator
import org.apache.hadoop.fs.{Path, FileSystem}

class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[CSVParser])
  }
}

val outDir = "/tmp/dmc-out"
val fs = FileSystem.get(sc.hadoopConfiguration)
fs.delete(new Path(outDir), true)

val largeLines = sc.textFile("/tmp/dmc-03-08/*.gz")
val parser = new CSVParser('|', '"')
largeLines.map(parser.parseLine(_).toList).saveAsTextFile(outDir, classOf[org.apache.hadoop.io.compress.GzipCodec])

If I start spark-shell with spark.kryo.registrator set like this

SPARK_JAVA_OPTS="-Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=MyKryoRegistrator" spark-shell

it complains that MyKryoRegistrator is not found when I run :load my_script in spark-shell:

14/08/20 12:14:01 ERROR KryoSerializer: Failed to run spark.kryo.registrator
java.lang.ClassNotFoundException: MyKryoRegistrator

What's wrong?
RE: How to set KryoRegistrator class in spark-shell
Hi Wang,

Have you tried doing this in your application?

conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "yourpackage.MyKryoRegistrator")

You then don't need to specify it via the command line.

Date: Wed, 20 Aug 2014 12:25:14 -0700
Subject: How to set KryoRegistrator class in spark-shell
From: bewang.t...@gmail.com
To: user@spark.apache.org
> I want to use opencsv's CSVParser to parse csv lines using a script like below in spark-shell: [...]
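A minimal standalone-application version of this suggestion might look like the sketch below. The package and object names are placeholders; it assumes the registrator is compiled into the application jar (so executors can load it by its fully qualified name), which is exactly what the REPL-defined class in the original script could not provide:

```scala
package yourpackage // placeholder package

import au.com.bytecode.opencsv.CSVParser
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoRegistrator

// Compiled into the application jar, so executors can instantiate it by name.
class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[CSVParser])
  }
}

object CsvApp {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("csv-parsing")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "yourpackage.MyKryoRegistrator") // fully qualified name
    val sc = new SparkContext(conf)
    // ... job logic, e.g. the CSV-parsing pipeline from the original script ...
    sc.stop()
  }
}
```

Setting both properties on the SparkConf before the context is created makes the command-line -D flags unnecessary.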
Re: How to set KryoRegistrator class in spark-shell
I can do that in my application, but I really want to know how I can do it in spark-shell, because I usually prototype in spark-shell before I put the code into an application.

On Wed, Aug 20, 2014 at 12:47 PM, Sameer Tilak ssti...@live.com wrote:
> Hi Wang,
> Have you tried doing this in your application?
> conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
> conf.set("spark.kryo.registrator", "yourpackage.MyKryoRegistrator")
> You then don't need to specify it via the command line. [...]