Re: Need help in SparkSQL
Another typical solution is to build a search index with Elasticsearch and use it as a secondary index for HBase.

On 23 Jul 2015 15:50, "Jörn Franke" wrote:
> I do not think you can put all your queries into the row key without
> duplicating the data for each query. However, this would be more of a last
> resort.
>
> Have you checked out Phoenix for HBase? This might suit your needs. It
> makes things much simpler, because it provides SQL on top of HBase.
>
> Nevertheless, Hive could also be a viable alternative, depending on how
> often you run queries, etc.
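A minimal sketch of the secondary-index pattern suggested at the start of this message, assuming plain in-memory dictionaries as stand-ins for both stores. It does not use the Elasticsearch or HBase client APIs; the field names, row keys, and sample records are hypothetical. The index maps each (field, value) pair to row keys, an AND query intersects those key sets, and only the matching rows are then fetched by key.

```python
from collections import defaultdict

# Hypothetical stand-in for an HBase table: row key -> record.
primary_store = {
    "evt-001": {"user": "u1", "flat_type": "1BHK", "area": "Hyderabad", "action": "visit"},
    "evt-002": {"user": "u2", "flat_type": "2BHK", "area": "Hyderabad", "action": "visit"},
    "evt-003": {"user": "u3", "flat_type": "1BHK", "area": "Pune", "action": "visit"},
    "evt-004": {"user": "u1", "flat_type": "1BHK", "area": "Hyderabad", "action": "buy"},
}


def build_index(store, fields):
    """Build an inverted index: (field, value) -> set of row keys."""
    index = defaultdict(set)
    for row_key, record in store.items():
        for field in fields:
            index[(field, record[field])].add(row_key)
    return index


def and_query(index, store, **conditions):
    """Resolve an AND query through the index, then fetch rows by key."""
    key_sets = [index.get((field, value), set()) for field, value in conditions.items()]
    matching = set.intersection(*key_sets) if key_sets else set()
    return [store[key] for key in sorted(matching)]


# Hypothetical stand-in for the search index kept next to the primary store.
secondary_index = build_index(primary_store, ["flat_type", "area", "action"])

hits = and_query(secondary_index, primary_store,
                 flat_type="1BHK", area="Hyderabad", action="visit")
print(hits)
```

The point of the design is that the primary store is only ever read by row key, so the row key no longer has to encode every query shape.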
Re: Need help in SparkSQL
I do not think you can put all your queries into the row key without duplicating the data for each query. However, this would be more of a last resort.

Have you checked out Phoenix for HBase? This might suit your needs. It makes things much simpler, because it provides SQL on top of HBase.

Nevertheless, Hive could also be a viable alternative, depending on how often you run queries, etc.

On Thu, 23 Jul 2015 at 7:14, Jeetendra Gangele wrote:
> The queries will be something like this:
>
> 1. How many users visited a 1 BHK flat in the last hour in a given area?
> 2. How many visitors were there for flats in a given area?
> 3. List all users who bought a given property in the last 30 days.
>
> Further, it may get more complex, involving multiple parameters in my query.
>
> The problem with HBase is designing a row key to get this data efficiently.
>
> Since I have multiple fields to query on, HBase may not be a good choice?
>
> I don't want to iterate over the result set that HBase returns to produce
> the result, because this will kill performance?
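To illustrate the row-key point made at the start of this message, here is a rough sketch that models a key-sorted store as a sorted Python list and a range scan as a binary-search prefix scan. This is not HBase code; the key layouts, the `|` separator, and the sample events are assumptions made for illustration. One layout serves the area-based queries, and serving the property-based query efficiently requires writing the same events a second time under a different key.

```python
import bisect


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, using binary search on sorted keys."""
    start = bisect.bisect_left(sorted_keys, prefix)
    end = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[start:end]


# Hypothetical events: (area, flat_type, property_id, timestamp, user).
events = [
    ("hyderabad", "1bhk", "p100", "2015-07-23T06:10", "u1"),
    ("hyderabad", "2bhk", "p200", "2015-07-23T06:20", "u2"),
    ("pune", "1bhk", "p300", "2015-07-23T06:30", "u3"),
    ("hyderabad", "1bhk", "p100", "2015-07-23T06:40", "u4"),
]

# Layout A: area first, then flat type. Serves queries scoped to an area.
by_area = sorted("|".join([a, f, t, u]) for a, f, p, t, u in events)

# Layout B: the same events written again, keyed by property first.
# Serves queries scoped to one property.
by_property = sorted("|".join([p, t, u]) for a, f, p, t, u in events)

area_hits = prefix_scan(by_area, "hyderabad|1bhk|")
property_hits = prefix_scan(by_property, "p100|")
print(area_hits)
print(property_hits)
```

With only layout A, the property query would have to scan every key, which is the full iteration the original poster wants to avoid.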
Re: Need help in SparkSQL
The queries will be something like this:

1. How many users visited a 1 BHK flat in the last hour in a given area?
2. How many visitors were there for flats in a given area?
3. List all users who bought a given property in the last 30 days.

Further, it may get more complex, involving multiple parameters in my query.

The problem with HBase is designing a row key to get this data efficiently.

Since I have multiple fields to query on, HBase may not be a good choice?

I don't want to iterate over the result set that HBase returns to produce the result, because this will kill performance?

On 23 July 2015 at 01:02, Jörn Franke wrote:
> Can you provide an example of an AND query? If you just do lookups, you
> should try HBase/Phoenix; otherwise you can try ORC with storage index
> and/or compression, but this depends on what your queries look like.

--
Hi,

Find my attached resume. I have around 7 years of total work experience. I worked for Amazon and Expedia in my previous assignments, and I am currently working with a start-up technology company called Insideview in Hyderabad.

Regards
Jeetendra
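The three example queries in this message could be written as SQL roughly as follows. This is only a sketch: SQLite (from the Python standard library) stands in for whichever SQL engine is eventually used, and the `events` table, its columns, the sample rows, and the fixed `NOW` value are all hypothetical. Date functions differ between engines, so the time-window expressions would need adapting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE events (
           user_id TEXT, action TEXT, flat_type TEXT,
           area TEXT, property_id TEXT, ts TEXT
       )"""
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("u1", "visit", "1BHK", "Hyderabad", "p100", "2015-07-23 06:30:00"),
        ("u2", "visit", "1BHK", "Hyderabad", "p100", "2015-07-23 04:00:00"),
        ("u3", "visit", "2BHK", "Hyderabad", "p200", "2015-07-23 06:45:00"),
        ("u4", "buy", "1BHK", "Hyderabad", "p100", "2015-07-10 12:00:00"),
        ("u5", "buy", "1BHK", "Hyderabad", "p100", "2015-05-01 12:00:00"),
    ],
)

# Fixed reference time so the example is reproducible.
NOW = "2015-07-23 07:00:00"

# 1. How many users visited a 1 BHK flat in the last hour in a given area?
q1 = conn.execute(
    """SELECT COUNT(DISTINCT user_id) FROM events
       WHERE action = 'visit' AND flat_type = '1BHK' AND area = ?
         AND ts >= datetime(?, '-1 hours')""",
    ("Hyderabad", NOW),
).fetchone()[0]

# 2. How many visitors were there for flats in a given area?
q2 = conn.execute(
    """SELECT COUNT(DISTINCT user_id) FROM events
       WHERE action = 'visit' AND area = ?""",
    ("Hyderabad",),
).fetchone()[0]

# 3. Which users bought a given property in the last 30 days?
q3 = [
    row[0]
    for row in conn.execute(
        """SELECT user_id FROM events
           WHERE action = 'buy' AND property_id = ?
             AND ts >= datetime(?, '-30 days')
           ORDER BY user_id""",
        ("p100", NOW),
    )
]

print(q1, q2, q3)
```

Each query is a conjunction of equality and range predicates over different columns, which is the shape that a single row key cannot serve on its own.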
RE: Need help in SparkSQL
Parquet

Mohammed

From: Jeetendra Gangele [mailto:gangele...@gmail.com]
Sent: Wednesday, July 22, 2015 5:48 AM
To: user
Subject: Need help in SparkSQL

Hi all,

I have data in MongoDB (a few TBs) that I want to migrate to HDFS to run complex analytical queries on it, such as AND queries involving multiple fields.

So my question is: in which format should I store the data in HDFS so that processing will be fast for this kind of query?

Regards
Jeetendra
Re: Need help in SparkSQL
Can you provide an example of an AND query? If you just do lookups, you should try HBase/Phoenix; otherwise you can try ORC with storage index and/or compression, but this depends on what your queries look like.

On Wed, 22 Jul 2015 at 14:48, Jeetendra Gangele wrote:
> Hi all,
>
> I have data in MongoDB (a few TBs) that I want to migrate to HDFS to run
> complex analytical queries on it, such as AND queries involving multiple
> fields.
>
> So my question is: in which format should I store the data in HDFS so
> that processing will be fast for this kind of query?
>
> Regards
> Jeetendra
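The "storage index" mentioned in this message refers to per-block min/max statistics that let a reader skip blocks which cannot match a predicate. The sketch below illustrates only that idea in plain Python; it is not the ORC file format or any ORC API, and the `Stripe` class, field names, and data are hypothetical. It assumes the rows are written in timestamp order, which is what makes the statistics selective.

```python
from dataclasses import dataclass


@dataclass
class Stripe:
    """Hypothetical stand-in for a block of rows plus its min/max statistics."""
    rows: list
    min_ts: int
    max_ts: int


def make_stripes(rows, stripe_size):
    """Split rows into fixed-size stripes and record min/max ts for each."""
    stripes = []
    for i in range(0, len(rows), stripe_size):
        chunk = rows[i:i + stripe_size]
        ts_values = [r["ts"] for r in chunk]
        stripes.append(Stripe(chunk, min(ts_values), max(ts_values)))
    return stripes


def scan(stripes, ts_from, area):
    """Answer `ts >= ts_from AND area == area`, skipping stripes by statistics."""
    stripes_read, hits = 0, []
    for stripe in stripes:
        if stripe.max_ts < ts_from:
            continue  # no row in this stripe can match, so it is never read
        stripes_read += 1
        hits.extend(r for r in stripe.rows
                    if r["ts"] >= ts_from and r["area"] == area)
    return hits, stripes_read


# 100 hypothetical rows in timestamp order, alternating between two areas.
rows = [{"ts": t, "area": "hyderabad" if t % 2 == 0 else "pune"} for t in range(100)]
stripes = make_stripes(rows, stripe_size=10)

hits, stripes_read = scan(stripes, ts_from=85, area="hyderabad")
print(len(hits), "matching rows,", stripes_read, "of", len(stripes), "stripes read")
```

The `area` predicate gets no benefit here because both areas appear in every stripe, which is one reason the answer depends on what the queries look like and how the data is sorted.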