HBase MR - key/value mismatch

2013-09-05 Thread Omkar Joshi
I'm trying to run MR code over stand-alone HBase (0.94.11). I read the HBase API and modified my MR code to read data, and I am getting exceptions in the reduce phase. The exception I get is: 13/09/05 16:16:17 INFO mapred.JobClient: map 0% reduce 0% 13/09/05 16:23:31 INFO

Re: Suggestion need on desinging Flatten table for HBase given scenario

2013-09-05 Thread Ted Yu
The attachment in your original email didn't go through. Please put it on some website so that everyone can see it. Thanks. On Sep 4, 2013, at 10:24 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Hi, I have shared it with you on Google+. Can't you see that

Re: HBase MR - key/value mismatch

2013-09-05 Thread Shahab Yunus
Try using Bytes.toBytes(your string) rather than String.getBytes(). Regards, Shahab On Thu, Sep 5, 2013 at 2:16 AM, Omkar Joshi omkar.jo...@lntinfotech.com wrote: I'm trying to execute a MR code over stand-alone HBase (0.94.11). I had read the HBase API and modified my MR code to read data and
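
The difference behind this advice can be shown without HBase on the classpath. HBase's Bytes.toBytes(String) encodes as UTF-8, while the no-argument String.getBytes() uses the JVM's default charset, which can differ between machines. The sketch below uses only the JDK; the class name RowKeyBytes, the helper utf8, and the sample string are illustrative assumptions, not taken from the thread. Whether the encoding was the cause of the original exception is not established here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RowKeyBytes {
    // Hypothetical stand-in for HBase's Bytes.toBytes(String): always UTF-8,
    // independent of the platform default charset.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "caf\u00e9"; // contains one non-ASCII character

        byte[] fixed = utf8(key);
        byte[] latin1 = key.getBytes(StandardCharsets.ISO_8859_1);

        // The same string yields different byte arrays under different charsets,
        // so keys written on one machine may not match keys built on another
        // if the default charset is relied upon.
        System.out.println("UTF-8 length:      " + fixed.length);
        System.out.println("ISO-8859-1 length: " + latin1.length);
        System.out.println("equal: " + Arrays.equals(fixed, latin1));
    }
}
```

For pure ASCII keys the two encodings agree, which is why this kind of mismatch can go unnoticed until non-ASCII data shows up.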

Re: user action modeling

2013-09-05 Thread Shahab Yunus
Your read queries seem to be driven more from the 'action' and 'object' perspective than from the user. 1- So one option is to make a composite key from action and object: action|object, where the columns are the users who are generating events on this combination. You can scan using prefix filter
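
A rough sketch of the action|object idea, using a sorted in-memory map as a stand-in for an HBase table (rows sorted by key, users as the "columns"). This is not HBase client code; the class ActionObjectIndex and its methods are hypothetical names chosen for illustration, and the prefix walk only imitates what a prefix-filtered scan would return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class ActionObjectIndex {
    // Sorted map as a stand-in for an HBase table: row key -> users.
    private final TreeMap<String, List<String>> rows = new TreeMap<>();

    // Composite row key in the form suggested in the thread: action|object.
    static String rowKey(String action, String object) {
        return action + "|" + object;
    }

    void record(String action, String object, String user) {
        rows.computeIfAbsent(rowKey(action, object), k -> new ArrayList<>()).add(user);
    }

    // Imitates a prefix scan: visit rows starting at "action|" in key order
    // and stop at the first key that no longer carries the prefix.
    List<String> objectsFor(String action) {
        String prefix = action + "|";
        List<String> out = new ArrayList<>();
        for (String key : rows.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            out.add(key.substring(prefix.length()));
        }
        return out;
    }

    // Point lookup on the full composite key.
    List<String> usersFor(String action, String object) {
        return rows.getOrDefault(rowKey(action, object), new ArrayList<>());
    }
}
```

The sketch assumes the separator character never appears inside an action name; a real design would need to enforce or escape that.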

Programming practices for implementing composite row keys

2013-09-05 Thread praveenesh kumar
Hello people, I have a scenario which requires creating composite row keys for my HBase table. Basically it would be entity1,entity2,entity3. Searches would be by entity1 first, and then by entity2 and entity3. I know I can do a row start-stop scan on entity1 first and then put row filters on entity2 and

Re: HBase MR - key/value mismatch

2013-09-05 Thread Ted Yu
public class SentimentCalculationHBaseReducer extends TableReducer<Text, Text, ImmutableBytesWritable> { The first type parameter for the reducer should be ImmutableBytesWritable. Cheers On Wed, Sep 4, 2013 at 11:16 PM, Omkar Joshi omkar.jo...@lntinfotech.com wrote: I'm trying to execute a

Re: user action modeling

2013-09-05 Thread Marcos Sousa
Hi, Yes, that's the point, I need to save dynamic parameters for each action :( I was thinking about distributing the data across 3 tables: - users: where I have all data about the user and the list of friends and documents on which he performed the action - user_actions: to save action and further

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Ted Yu
For #2 and #4, see HBASE-8693 'DataType: provide extensible type API', which has been integrated into 0.96. Cheers On Thu, Sep 5, 2013 at 7:14 AM, Shahab Yunus shahab.yu...@gmail.com wrote: My 2 cents: 1- Yes, that is one way to do it. You can also use fixed length for every attribute

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Shahab Yunus
Ah! I didn't know about HBASE-8693. Good information. Thanks Ted. Regards, Shahab On Thu, Sep 5, 2013 at 10:53 AM, Ted Yu yuzhih...@gmail.com wrote: For #2 and #4, see HBASE-8693 'DataType: provide extensible type API' which has been integrated to 0.96 Cheers On Thu, Sep 5, 2013 at 7:14

Re: HBase MR - key/value mismatch

2013-09-05 Thread Ted Yu
The reducer also serves as a combiner whose output would be sent to the reducer. org.apache.hadoop.mapreduce.Reducer<Text, Text, ImmutableBytesWritable, org.apache.hadoop.io.Writable>.Context context) So the type parameters above should facilitate this. Take a look at the PutCombiner from

Concurrent connections to Hbase

2013-09-05 Thread Kiru Pakkirisamy
Hi All, I'd like to hear from users who are running a big HBase setup with multiple concurrent connections. I would like to know the # of cores/machines, # of queries, Gets/RPCs, HBase version etc. We are trying to build an application with sub-second query performance (using coprocessors) and

Re: HBase MR - key/value mismatch

2013-09-05 Thread Shahab Yunus
Ted, I might be missing something very basic, but why should the OP's reducer key be of type ImmutableBytesWritable if he is emitting Text in the mapper? Thanks. protected void map( ImmutableBytesWritable key, Result value,

Re: Concurrent connections to Hbase

2013-09-05 Thread James Taylor
Hey Kiru, The Phoenix team would be happy to work with you to benchmark your performance if you can give us specifics about your schema design, queries, and data sizes. We did something similar for Sudarshan for a Bloomberg use case here[1]. Thanks, James [1].

[ANN]: HBase-Writer 0.94.0 available for download

2013-09-05 Thread R Smith
The HBase-Writer team is happy to announce that HBase-Writer 0.94.0 is available for download: http://code.google.com/p/hbase-writer/downloads/list HBase-Writer 0.94.0 is a maintenance release that fixes library compatibility since older versions of Heritrix and HBase. More details may be

FILE_BYTES_READ counter missing for HBase mapreduce job

2013-09-05 Thread Haijia Zhou
Hi, Basically I have a mapreduce job that scans an HBase table and does some processing. After the job finishes, I only get three filesystem counters: HDFS_BYTES_READ, HDFS_BYTES_WRITTEN and FILE_BYTES_WRITTEN. The value of HDFS_BYTES_READ is not very useful here because it shows the size of the .META

Re: FILE_BYTES_READ counter missing for HBase mapreduce job

2013-09-05 Thread Haijia Zhou
Additional info: The mapreduce job I run is a map-only job. It does not have reducers and it writes data directly to HDFS in the mapper. Could this be the reason why there's no value for FILE_BYTES_READ? If so, is there any easy way to get the total input data size? Thanks Haijia On Thu, Sep 5,

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Doug Meil
Greetings, More food for thought: some case studies on composite rowkey design are in the refguide: http://hbase.apache.org/book.html#schema.casestudies On 9/5/13 12:15 PM, Anoop John anoop.hb...@gmail.com wrote: Hi Have a look at Phoenix[1]. There you can define a

Re: Suggestion need on desinging Flatten table for HBase given scenario

2013-09-05 Thread Doug Meil
Greetings, The refguide has some case studies on composite rowkey design that might be helpful. http://hbase.apache.org/book.html#schema.casestudies From: Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com Reply-To:

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Anoop John
Hi, Have a look at Phoenix[1]. There you can define a composite RK model, and it handles the ordering of negative numbers. Also, the scan model you mentioned will be well supported with start/stop RK on entity1 and using SkipScanFilter for the others. -Anoop- [1] https://github.com/forcedotcom/phoenix
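
On the ordering of negative numbers: HBase compares row keys as unsigned bytes, so a plain two's-complement integer would sort negative values after positive ones. One generic way around this is to flip the sign bit before writing the big-endian bytes. The sketch below shows that technique with the JDK only; it is an illustration under that assumption, not a description of how Phoenix encodes its keys, and the class OrderedIntKey is a hypothetical name.

```java
class OrderedIntKey {
    // Big-endian encoding with the sign bit flipped, so that unsigned
    // byte-wise comparison of the result matches numeric order of the input.
    static byte[] encode(int v) {
        int u = v ^ 0x80000000;
        return new byte[] {
            (byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u
        };
    }

    // Inverse of encode: rebuild the int and flip the sign bit back.
    static int decode(byte[] b) {
        int u = ((b[0] & 0xff) << 24)
              | ((b[1] & 0xff) << 16)
              | ((b[2] & 0xff) << 8)
              | (b[3] & 0xff);
        return u ^ 0x80000000;
    }

    // Lexicographic comparison that treats each byte as unsigned,
    // mirroring how row keys are ordered.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

Because the encoding is fixed-width, such a field can be concatenated with other fixed-width fields into a composite key without a separator.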

what's different between numberOfStores and numberOfStorefiles in region status variables

2013-09-05 Thread ch huang
Hi all, I checked the region server status through http://IP:60030/rs-status and see that for some regions the two variables are not always the same. What is the difference between them? numberOfStores=1, numberOfStorefiles=3

Re: what's different between numberOfStores and numberOfStorefiles in region status variables

2013-09-05 Thread lars hofhansl
Each column family is a store (in fact, there is one store per region and column family). Each region may have more than one actual HFile per store. A new HFile (storefile) is created, for example, when the memstore is flushed to disk. When too many HFiles have accumulated for a store, they are
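
The relationship between the two counters can be mimicked with a toy model: one store per column family, and one new storefile per memstore flush. This is a simplified sketch for intuition only, not HBase code; ToyRegion, Store, and the method names are hypothetical, and what happens once many files accumulate is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyRegion {
    // One store per column family: an in-memory buffer plus flushed files.
    static final class Store {
        final List<String> memstore = new ArrayList<>();
        final List<List<String>> storefiles = new ArrayList<>();

        // Flushing writes the buffered cells out as one new storefile.
        void flush() {
            if (memstore.isEmpty()) {
                return;
            }
            storefiles.add(new ArrayList<>(memstore));
            memstore.clear();
        }
    }

    private final Map<String, Store> stores = new HashMap<>();

    ToyRegion(String... families) {
        for (String family : families) {
            stores.put(family, new Store());
        }
    }

    void put(String family, String cell) {
        stores.get(family).memstore.add(cell);
    }

    void flush() {
        for (Store store : stores.values()) {
            store.flush();
        }
    }

    int numberOfStores() {
        return stores.size();
    }

    int numberOfStorefiles() {
        int total = 0;
        for (Store store : stores.values()) {
            total += store.storefiles.size();
        }
        return total;
    }
}
```

In this model, a region with a single column family that has been flushed three times reports one store and three storefiles, matching the numberOfStores=1, numberOfStorefiles=3 reading in the question.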