Hello Rzykov, I tried your approach but unfortunately I'm getting an error. This is what I get:
[info] Caused by: java.lang.reflect.InvocationTargetException
[info]   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[info]   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
[info]   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[info]   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
[info]   at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
[info]   ... 22 more
[info] Caused by: java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
[info]   at ru.retailrocket.spark.multitool.Loaders$CombineTextFileRecordReader.<init>(Loaders.scala:31)
[info]   ... 27 more

I saw that you tested with Spark 1.1.0, but I am currently forced to use 1.0.2. Perhaps that is the source of the error.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Optimizing-text-file-parsing-many-small-files-versus-few-big-files-tp19266p19369.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
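For what it's worth, "Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected" is the classic symptom of a Hadoop 1 vs. Hadoop 2 mismatch: TaskAttemptContext was a concrete class in the Hadoop 1.x mapreduce API and became an interface in 2.x, so code compiled against one breaks at runtime on the other. It may be the Hadoop client on the classpath rather than the Spark version itself. A minimal sbt sketch of pinning a consistent hadoop-client follows; the version numbers are illustrative assumptions and should be matched to your actual cluster:

```scala
// build.sbt -- hypothetical sketch; the versions below are assumptions, not taken
// from the original thread. The idea: make the hadoop-client on the classpath
// match the Hadoop line (1.x vs 2.x) that the record-reader code was compiled for.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.0.2" % "provided",
  // assumed cluster version; use a 1.x artifact instead if the cluster runs Hadoop 1
  "org.apache.hadoop" % "hadoop-client" % "2.4.0"
)
```

If the multitool library was built against a different Hadoop line than your cluster, rebuilding it from source against your cluster's hadoop-client version is another option.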
