Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
GMT+02:00 Amit Sela:
> I think you're missing:
>
>   val query = wordCounts.writeStream
>     .outputMode("complete")
>     .format("console")
>     .start()
>
> Did it help?
>
> On Mon, Aug 1, 2016 at 2:44 PM Jacek Laskowski wrote: …
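A minimal, runnable sketch of Amit's suggestion, assuming a socket source stands in for the custom REST source discussed in this thread (Spark 2.0 APIs):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("streaming-word-count")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Stand-in source; a custom REST source would be plugged in
    // through .format(...) instead (see the provider sketch below).
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", "9999")
      .load()

    val wordCounts = lines.as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    // The missing piece: nothing runs until a sink is started.
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()

Defining wordCounts alone does nothing; start() is what actually kicks off the streaming query.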

Re: spark 2.0 readStream from a REST API

2016-08-02 Thread Ayoub Benali
Hello, here is the code I am trying to run: https://gist.github.com/ayoub-benali/a96163c711b4fce1bdddf16b911475f2 Thanks, Ayoub.

2016-08-01 13:44 GMT+02:00 Jacek Laskowski:
> On Mon, Aug 1, 2016 at 11:01 AM, Ayoub Benali wrote:
> > the problem now is that when I consume…

Re: spark 2.0 readStream from a REST API

2016-08-01 Thread Ayoub Benali
Michael Armbrust:
> You have to add a file in resources too (example:
> <https://github.com/apache/spark/blob/master/sql/core/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister>).
> Either that or give a full class name.
>
> On Sun, Jul 31, 2016 …
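For reference, the registration mechanism Michael describes is Java's ServiceLoader: a plain-text resource named after the DataSourceRegister interface, containing the provider's fully qualified class name (the class name below is hypothetical):

    # src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
    com.example.streaming.MySourceProvider

With that file on the classpath, the provider's shortName() becomes usable in .format("mysource"); without it, .format("com.example.streaming.MySourceProvider") with the full class name still works.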

Re: spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
"Failed to find data source: mysource. Please find packages at http://spark-packages.org". Is there something I need to do in order to "load" the stream source provider? Thanks, Ayoub

2016-07-31 17:19 GMT+02:00 Jacek Laskowski:
> On Sun, Jul 31, 2016 at 12:53 PM, Ayoub Benali wrote:
> > I started…
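For anyone hitting the same error: "Failed to find data source" means Spark could not resolve the short name to a provider class. A compile-level skeleton of what such a provider looks like against the Spark 2.0 source APIs (class, package, and source name are hypothetical; the bodies are left as stubs):

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.execution.streaming.Source
    import org.apache.spark.sql.sources.{DataSourceRegister, StreamSourceProvider}
    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    class MySourceProvider extends StreamSourceProvider with DataSourceRegister {

      // The name resolved by spark.readStream.format("mysource"),
      // once the class is registered (see the META-INF/services note above).
      override def shortName(): String = "mysource"

      private val defaultSchema = StructType(StructField("value", StringType) :: Nil)

      override def sourceSchema(
          sqlContext: SQLContext,
          schema: Option[StructType],
          providerName: String,
          parameters: Map[String, String]): (String, StructType) =
        (shortName(), schema.getOrElse(defaultSchema))

      override def createSource(
          sqlContext: SQLContext,
          metadataPath: String,
          schema: Option[StructType],
          providerName: String,
          parameters: Map[String, String]): Source =
        ??? // implement getOffset/getBatch/stop against the REST endpoint here
    }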

spark 2.0 readStream from a REST API

2016-07-31 Thread Ayoub Benali
Hello, I started playing with the Structured Streaming API in Spark 2.0 and I am looking for a way to create a streaming Dataset/DataFrame from a REST HTTP endpoint, but I am a bit stuck. "readStream" in SparkSession has a json method, but that one expects a path (S3, HDFS, etc.) and I want to avoid…
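The call shape being aimed at, once such a custom source exists (the format and option names are hypothetical):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("rest-stream").getOrCreate()

    // json(path) is file-based; a non-file source goes through format()/load():
    val httpStream = spark.readStream
      .format("mysource")                       // short name or full provider class name
      .option("url", "http://example.com/api")  // option the provider would read
      .load()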

Re: RDD[Future[T]] => Future[RDD[T]]

2015-07-26 Thread Ayoub Benali
It doesn't work because mapPartitions expects a function f: (Iterator[T]) ⇒ Iterator[U], while .sequence wraps the iterator in a Future.

2015-07-26 22:25 GMT+02:00 Ignacio Blasco:
> Maybe using mapPartitions and .sequence inside it?
> On 26/7/2015 10:22 PM, "Ayoub" wrote:
> > Hello,
> >
> > I…
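One way around the mismatch, sketched under the assumption of an RDD[Future[Int]]: resolve the futures inside each partition so the function really does return a plain Iterator, as mapPartitions requires (the timeout is a placeholder):

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._
    import org.apache.spark.rdd.RDD

    // mapPartitions needs an (Iterator[Future[Int]]) => Iterator[Int],
    // so the futures must be awaited inside the partition rather than
    // returning a Future[Iterator[Int]].
    def resolve(rdd: RDD[Future[Int]]): RDD[Int] =
      rdd.mapPartitions { futures =>
        val batched = Future.sequence(futures.toSeq) // Future[Seq[Int]]
        Await.result(batched, 10.minutes).iterator   // blocks the task thread
      }

This yields an RDD[Int] rather than the Future[RDD[T]] from the subject line; a fully non-blocking version cannot be written inside mapPartitions for exactly the type-signature reason stated above.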

Re: SQL JSON array operations

2015-01-15 Thread Ayoub Benali
You could try to use the hive context, which brings HiveQL; it would allow you to query nested structures using "LATERAL VIEW explode…".

On Jan 15, 2015 4:03 PM, "jvuillermet" wrote:
> let's say my JSON file lines look like this
>
> {"user": "baz", "tags": ["foo", "bar"]}
>
> sqlContext.json…
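A sketch of that approach against the example line above, using the Spark 1.2-era HiveContext API (file and table names are placeholders):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc) // sc: an existing SparkContext

    // Lines like {"user": "baz", "tags": ["foo", "bar"]}
    val users = hiveContext.jsonFile("users.json")
    users.registerTempTable("users")

    // LATERAL VIEW explode(tags) yields one output row per array element.
    hiveContext.sql(
      """SELECT user, tag
        |FROM users
        |LATERAL VIEW explode(tags) t AS tag
      """.stripMargin).collect().foreach(println)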

Re: Parquet compression codecs not applied

2015-01-10 Thread Ayoub Benali
It worked, thanks. This doc page recommends using "spark.sql.parquet.compression.codec" to set the compression codec, and I thought this setting would be forwarded to the hive context given that HiveContext extends SQLContext, but it wasn't…

Parquet compression codecs not applied

2015-01-08 Thread Ayoub Benali
Hello, I tried to save a table created via the hive context as a Parquet file, but whatever compression codec (uncompressed, snappy, gzip or lzo) I set via setConf, like:

    setConf("spark.sql.parquet.compression.codec", "gzip")

the size of the generated files is always the same, so it seems like…
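A minimal sketch of that experiment with the Spark 1.2-era API (the table name and output paths are hypothetical); if the codec were being applied, the four output directories would normally differ noticeably in size:

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc) // sc: an existing SparkContext

    // Write the same table once per codec and compare the sizes on disk.
    for (codec <- Seq("uncompressed", "snappy", "gzip", "lzo")) {
      hiveContext.setConf("spark.sql.parquet.compression.codec", codec)
      hiveContext.sql("SELECT * FROM events")             // hypothetical Hive table
        .saveAsParquetFile(s"/tmp/parquet_test_$codec")   // one directory per codec
    }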