Has anyone used the Python Spark API with AccumuloInputFormat?

Using AccumuloInputFormat from Scala and Java within Spark is
straightforward, but the Python Spark API's newAPIHadoopRDD function takes
its configuration as a Python dict (
https://spark.apache.org/docs/1.1.0/api/python/pyspark.context.SparkContext-class.html#newAPIHadoopRDD)
and there isn't an obvious set of job configuration keys to use. From
looking at the Accumulo source, it seems job configuration values are
stored under keys derived from Java enums, and it's unclear to me which
key strings to put in my Python dict.
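
In case it helps, here is roughly the shape of the call I'm attempting.
The conf keys below are placeholders/guesses, since that is exactly the
part I can't work out (and the Key/Value classes would presumably also
need converters to come back to Python cleanly):

    from pyspark import SparkContext

    sc = SparkContext(appName="accumulo-test")

    # These key strings are guesses -- Accumulo's configurators appear to
    # build the real keys from Java enums, and I don't know what they
    # resolve to.
    conf = {
        "AccumuloInputFormat.ConnectorInfo.principal": "user",      # guess
        "AccumuloInputFormat.ConnectorInfo.token": "secret",        # guess
        "AccumuloInputFormat.InstanceOpts.name": "myinstance",      # guess
        "AccumuloInputFormat.InstanceOpts.zooKeepers": "zk1:2181",  # guess
        "AccumuloInputFormat.ScanOpts.tableName": "mytable",        # guess
    }

    rdd = sc.newAPIHadoopRDD(
        "org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat",
        "org.apache.accumulo.core.data.Key",
        "org.apache.accumulo.core.data.Value",
        conf=conf)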

Any thoughts as to how to do this would be helpful!

Thanks,

Kina
