Hi All,
I have an RDD having the data in the following form :
tempRDD: RDD[(String, (String, String))]
(brand , (product, key))
("amazon",("book1","tech"))
("eBay",("book1","tech"))
("barns&noble",("book","tech"))
("amazon",("book2","tech"))
I would like to group the data by brand and then do
further processing.
I am kind of stuck.
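A minimal sketch of the grouping, assuming `tempRDD` is shaped as above; the count at the end is only an illustrative stand-in for the unspecified "further processing":

```scala
import org.apache.spark.rdd.RDD

// Group all (product, key) pairs under their brand:
val grouped: RDD[(String, Iterable[(String, String)])] = tempRDD.groupByKey()

// If the further processing is an aggregation, reduceByKey avoids pulling
// every value for a brand onto one executor, e.g. counting products per brand:
val productsPerBrand: RDD[(String, Int)] =
  tempRDD.mapValues(_ => 1).reduceByKey(_ + _)
```

Whether `groupByKey` or a combining operation like `reduceByKey`/`aggregateByKey` fits better depends on what the further processing actually does.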
On Tue, Mar 15, 2016 at 10:50 AM, Suniti Singh
wrote:
> Is it always the case that one title is a substring of another ? -- Not
> always. One title can have values like D.O.C, doctor_{areacode},
> doc_{dep,areacode}
>
> On Mon, Mar 14, 2016, wrote:
> Is it always the case that one title is a substring of another ?
>
> On Tue, Mar 15, 2016 at 6:46 AM, Suniti Singh
> wrote:
>
Hi All,
I have two tables with the same schema but different data. I have to join the
tables based on one column and then do a group by on the same column name.
Now the data in that column in the two tables might or might not exactly match. (Ex
- column name is "title". Table1.title = "doctor" and Table2.ti
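One hedged way to handle approximately-matching titles is to derive a normalized key and join and group on that instead of the raw column. This is only a sketch: `table1`/`table2` are hypothetical DataFrames, and the normalization rule here is purely illustrative — the real rule depends on the actual title formats:

```scala
import org.apache.spark.sql.functions.udf

// Illustrative only: collapse variants like "D.O.C", "doctor_NY", "doc_NY"
// onto a single key by stripping non-letters and matching a common prefix.
val normalizeTitle = udf { title: String =>
  val t = title.toLowerCase.replaceAll("[^a-z]", "")
  if (t.startsWith("doc")) "doctor" else t
}

// table1 and table2 are hypothetical DataFrames, each with a "title" column:
val t1 = table1.withColumn("titleKey", normalizeTitle(table1("title")))
val t2 = table2.withColumn("titleKey", normalizeTitle(table2("title")))

val joined  = t1.join(t2, Seq("titleKey"))        // join on normalized title
val grouped = joined.groupBy("titleKey").count()  // then group on the same key
```

If the variants cannot be captured by a normalization rule, a fuzzy-match (e.g. edit-distance) join is possible but much more expensive, since it cannot use an equi-join.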
hive 1.6.0 in embedded mode doesn't connect to metastore --
https://issues.apache.org/jira/browse/SPARK-9686
https://forums.databricks.com/questions/6512/spark-160-not-able-to-connect-to-hive-metastore.html
On Wed, Mar 9, 2016 at 10:48 AM, Suniti Singh
wrote:
Hi,
I am able to reproduce this error only when using Spark 1.6.0 and Hive
1.6.0. The hive-site.xml is on the classpath, but somehow Spark skips the
classpath search for hive-site.xml and starts using the default Derby
metastore.
16/03/09 10:37:52 INFO MetaStoreDirectSql: Using direct SQL, underl
Please check the document for the configuration -
http://spark.apache.org/docs/latest/job-scheduling.html#configuration-and-setup
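For reference, a spark-defaults.conf sketch of the settings that page describes (dynamic allocation requires the external shuffle service to be running on every worker):

```
spark.dynamicAllocation.enabled  true
spark.shuffle.service.enabled    true
```

In standalone mode the workers start the shuffle service themselves when `spark.shuffle.service.enabled` is set; on YARN the shuffle auxiliary service has to be configured in yarn-site.xml, as the linked page describes.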
On Tue, Mar 8, 2016 at 10:14 AM, Silvio Fiorito <
silvio.fior...@granturing.com> wrote:
> You’ve started the external shuffle service on all worker nodes, correct?
>
> Hi Suniti,
>
> why are you mixing spark-sql version 1.2.0 with spark-core, spark-hive v
> 1.6.0?
>
> I’d suggest you try to keep all the libs at the same version.
>
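A pom.xml sketch of that suggestion, pinning all Spark artifacts to one version through a shared property (artifact names assume Scala 2.10, Spark 1.6.0's default build):

```xml
<!-- Sketch: keep every Spark artifact at the same version via one property -->
<properties>
  <spark.version>1.6.0</spark.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
```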
> On Mar 7, 2016, at 6:15 PM, Suniti Singh wrote:
>
>
>
> org.apache.spa
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 8 March 2016 at 00:45, Suniti Singh wrote:
>
>> Thanks Mich and Kabeer for quick reply.
>>
>> @
>
>
>
> However, I do note that you are using the spark-sql include and the Spark
> version you use is 1.6.0. Can you please try 1.5.0 to see if it works?
> I haven't yet tried Spark 1.6.0.
>
>
> On 08/03/16 00:15, Suniti Singh wrote:
Hi All,
I am trying to create a Hive context in a Scala program as follows in Eclipse:
Note -- I have added the Maven dependencies for spark-core, spark-hive, and spark-sql.
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD.rddToPairRDDFunctions
object D
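A self-contained sketch of what the truncated program presumably builds up to (the object name and app name are placeholders; APIs as of Spark 1.6):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical name; the original object declaration is truncated above.
object HiveContextDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HiveContextDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // spark-hive must be on the classpath at the same version as spark-core,
    // and hive-site.xml in a directory on the classpath (e.g. src/main/resources)
    // if an external metastore should be used instead of embedded Derby.
    val hiveContext = new HiveContext(sc)
    hiveContext.sql("SHOW TABLES").show()

    sc.stop()
  }
}
```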