Hive on Spark Job Monitoring

2017-03-16 Thread Ninad Shringarpure
Hi Team, I wanted to understand how Hive on Spark queries actually map to the Spark jobs that Hive triggers underneath. AFAIK each Hive query triggers a new Spark job, but someone contradicted this and I wanted to confirm what the real design is. Please let me know if there is

Making withColumn nullable

2017-01-27 Thread Ninad Shringarpure
Hi Team, When I add a column to my data frame using withColumn and assign it a value, the column is automatically marked as not nullable in the schema. The Hive table I want to insert into declares this column as nullable, so an error is thrown when I try to save. Is

Creating UUID using SparksSQL

2017-01-18 Thread Ninad Shringarpure
Hi Team, Is there a standard way of generating a unique id for each row from Spark SQL? I am looking for functionality similar to UUID generation in Hive. Let me know if you need any additional information. Thanks, Ninad

[DataFrames] map function - 2.0

2016-12-15 Thread Ninad Shringarpure
Hi Team, When going through the Dataset class for Spark 2.0, I noticed that both overloaded map functions, with and without an encoder, are marked as experimental. Is there a reason for this, and are there issues that developers should be aware of when using them in production applications? Also is there a

Re: [Spark-SQL] collect_list() support for nested collection

2016-12-13 Thread Ninad Shringarpure
0/2840265927289860/latest.html > On Tue, Dec 13, 2016 at 10:43 AM, Ninad Shringarpure <ni...@cloudera.com> wrote: > > Hi Team, > > Does Spark 2.0 support non-primitive types in collect_l

Fwd: [Spark-SQL] collect_list() support for nested collection

2016-12-13 Thread Ninad Shringarpure
Hi Team, Does Spark 2.0 support non-primitive types in collect_list for inserting nested collections? Would appreciate any references or samples. Thanks, Ninad

Unsubscribe

2016-11-28 Thread Ninad Shringarpure
Unsubscribe

Fwd: jdbcRDD for data ingestion from RDBMS

2016-10-17 Thread Ninad Shringarpure
Hi Team, One of my client teams is trying to see if they can use Spark instead of Sqoop to source data from an RDBMS. The data would be substantially large, on the order of billions of records. I am not sure from reading the documentation whether jdbcRDD by design is going to be able to scale well for this