Re: Help with Map/Reduce program

llpind Wed, 10 Jun 2009 22:12:56 -0700


When is .20 planned for release?  This particular issue is really
important to us.

Stack, I also have another question: the problem we are trying to solve
doesn't really need the extra layer present in the HBase (BigTable) structure
(a RowResult holds a row key and a HashMap of column name to value). What we
really need is a row key which simply holds a set of values.  Essentially
this is a many-to-many relationship.  I wanted your thoughts on how we can go
about solving this problem (we can start another thread for this if you’d
like). Is this something HBase can solve, or something that could potentially
be an HBase fork?  Right now we are still in test mode and only have to deal
with millions of columns, but in production (if the company sticks with
HBase) the columns could be in the billions.  One idea we came up with is to
have an overflow table… e.g.

For a given row key we list the first 10,000 columns (values in our case),
and after that we create a column with an overflow id pointing to an overflow
table which is keyed on this id.

This looks like it may work, but it isn’t the most elegant solution.  I’d
appreciate input from anyone on this issue.  Please let me know if you
need me to explain our problem in more detail.
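
To make the overflow idea concrete, here is a rough in-memory sketch in
Python. It is not HBase code: plain dicts stand in for the main table and the
overflow table. The marker column name "__overflow__", the id format, and the
chaining of overflow rows (each overflow row is capped too and points to the
next one) are assumptions made for illustration only.

```python
import itertools

MAX_INLINE = 10_000             # per-row cap mentioned in the post
OVERFLOW_COL = "__overflow__"   # hypothetical marker column name

main_table = {}       # row key -> {column: cell}; stands in for the main table
overflow_table = {}   # overflow id -> {column: cell}; stands in for the overflow table
_next_id = itertools.count(1)


def add_value(row_key, value, cap=MAX_INLINE):
    """Store value as a column under row_key, spilling into chained overflow rows."""
    if value == OVERFLOW_COL:
        raise ValueError("value collides with the overflow marker column")
    row = main_table.setdefault(row_key, {})
    while True:
        if value in row:
            return  # already stored in this chunk; the row behaves like a set
        data_cols = len(row) - (1 if OVERFLOW_COL in row else 0)
        if data_cols < cap:
            row[value] = b""  # the value lives in the column name; the cell is empty
            return
        if OVERFLOW_COL not in row:
            # This chunk is full: create the next overflow row and link to it.
            overflow_id = f"{row_key}/{next(_next_id)}"
            row[OVERFLOW_COL] = overflow_id
            overflow_table[overflow_id] = {}
        row = overflow_table[row[OVERFLOW_COL]]


def read_values(row_key):
    """Yield every value for row_key one chunk at a time, following overflow ids."""
    row = main_table.get(row_key)
    while row is not None:
        for col in row:
            if col != OVERFLOW_COL:
                yield col
        next_id = row.get(OVERFLOW_COL)
        row = overflow_table.get(next_id) if next_id is not None else None
```

Because read_values is a generator, a reader only ever holds one chunk of at
most `cap` columns at a time, which is the property we are after given the
heap problems with very wide rows. The cost is that appending a value walks
the whole chain, so a real version would probably also keep a pointer to the
last chunk.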

stack-3 wrote:
> 
> On Wed, Jun 10, 2009 at 4:52 PM, llpind <[email protected]> wrote:
> 
>>
>> Thanks.  I think the problem is I have potentially millions of columns.
>>
> 
>> where a given RowResult can hold millions of columns to values.  That's
>> why Map/Reduce is having problems as well (Java Heap exception).  I've
>> upped mapred.child.java.opts, but the problem persists.
>>
> 
> See also HBASE-867: https://issues.apache.org/jira/browse/HBASE-867
> St.Ack
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Help-with-Map-Reduce-program-tp23952252p23975405.html
Sent from the HBase User mailing list archive at Nabble.com.
