> This facility is not exposed in the REST API at the moment
> (not that I know of -- please someone correct me if I'm
> wrong).

Wrong. :-) See ScannerModel in the rest package:

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/rest/model/ScannerModel.html

ScannerModel#setBatch

  - Andy

--- On Wed, 3/16/11, Stack <[email protected]> wrote:

> From: Stack <[email protected]>
> Subject: Re: habse schema design and retrieving values through REST interface
> To: [email protected]
> Date: Wednesday, March 16, 2011, 10:47 AM
>
> You can limit the return when scanning from the java api; see
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)
> This facility is not exposed in the REST API at the moment (not that
> I know of -- please someone correct me if I'm wrong). So, yes, wide
> rows, if thousands of elements of some size, since they need to be
> composed all in RAM, could bring on an OOME if the composed size >
> available heap.
>
> St.Ack
>
>
> On Wed, Mar 16, 2011 at 2:41 AM, sreejith P. K. <[email protected]> wrote:
> > With this schema, if i can limit the column family over a particular range,
> > I can manage everything else. (like Select first n columns of a column
> > family)
> >
> > Sreejith
> >
> >
> > On Wed, Mar 16, 2011 at 12:33 PM, sreejith P. K. <[email protected]>wrote:
> >
> >> @ Jean-Daniel,
> >>
> >> As i told, each row key contains thousands of column family values (may be
> >> i am wrong with the schema design). I started REST and tried to cURL
> >> http:/localhost/tablename/rowname. It seems it will work only with limited
> >> amount of data (may be i can limit the cURL output), and how i can limit the
> >> column values for a particular row?
> >> Suppose i have two thousand urls under a keyword and i need to fetch the
> >> urls and should limit the result to five hundred. How it is possible??
> >>
> >> @ tsuna,
> >>
> >> It seems http://www.elasticsearch.org/ using CouchDB right?
> >>
> >>
> >> On Tue, Mar 15, 2011 at 11:32 PM, Jean-Daniel Cryans <[email protected]>wrote:
> >>
> >>> Can you tell why it's not able to get the bigger rows? Why would you
> >>> try another schema if you don't even know what's going on right now?
> >>> If you have the same issue with the new schema, you're back to square
> >>> one right?
> >>>
> >>> Looking at the logs should give you some hints.
> >>>
> >>> J-D
> >>>
> >>> On Tue, Mar 15, 2011 at 10:19 AM, sreejith P. K. <[email protected]>
> >>> wrote:
> >>> > Hello experts,
> >>> >
> >>> > I have a scenario as follows,
> >>> > I need to maintain a huge table for a 'web crawler' project in HBASE.
> >>> > Basically it contains thousands of keywords and for each keyword i need to
> >>> > maintain a list of urls (it again will count in thousands). Corresponding to
> >>> > each url, i need to store a number, which will in turn resemble the priority
> >>> > value the keyword holds.
> >>> > Let me explain you a bit, Suppose i have a keyword 'united states', i need
> >>> > to store about ten thousand urls corresponding to that keyword. Each keyword
> >>> > will be holding a priority value which is an integer. Again i have thousands
> >>> > of keywords like that. The rare thing about this is i need to do the project
> >>> > in PHP.
> >>> >
> >>> > I have configured a hadoop-hbase cluster consists of three machines. My plan
> >>> > was to design the schema by taking the keyword as 'row key'. The urls i will
> >>> > keep as column family. The schema looked fine at first. I have done a lot of
> >>> > research on how to retrieve the url list if i know the keyword. Any ways i
> >>> > managed a way out by preg-matching the xml data out put using the url
> >>> > http://localhost:8080/tablename/rowkey (REST interface i used). It also
> >>> > works fine if the url list has a limited number of urls. When it comes in
> >>> > thousands, it seems i cannot fetch the xml data itself!
> >>> > Now I am in a do or die situation. Please correct me if my schema design
> >>> > needs any changes (I do believe it should change!) and please help me up to
> >>> > retrieve the column family values (urls)
> >>> > corresponding to each row-key in an efficient way. Please guide me how i
> >>> > can do the same using PHP-REST interface.
> >>> > Thanks in advance.
> >>> >
> >>> > Sreejith
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >> Sreejith PK
> >> Nesote Technologies (P) Ltd
> >>
> >>
> >>
> >
> >
> > --
> > Sreejith PK
> > Nesote Technologies (P) Ltd
> >
>
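
For reference, here is a rough sketch of what using ScannerModel#setBatch through the REST gateway might look like from a client, so that a wide row comes back in pieces instead of as one large XML document. It is written in Python only to keep it short; the same requests could be issued from PHP or cURL. Assumptions, none of which come from the thread: the gateway accepts a PUT of a `<Scanner batch="..." startRow="..." endRow="..."/>` body on `/<table>/scanner` and answers with a `Location` header for the new scanner, row keys are base64-encoded in the XML, and each GET on that location returns up to `batch` cells until the scanner is exhausted. The function names and the base URL are hypothetical; please check the details against the REST documentation for your HBase version.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def b64(raw: bytes) -> str:
    """Base64-encode bytes into an ASCII string (assumed encoding for row keys in the XML)."""
    return base64.b64encode(raw).decode("ascii")


def scanner_body(row: bytes, batch: int) -> str:
    """Build a Scanner XML body covering exactly one row, `batch` cells per fetch."""
    scanner = ET.Element("Scanner", {
        "batch": str(batch),
        "startRow": b64(row),
        # The end row is exclusive; row + 0x00 is the smallest key after `row`.
        "endRow": b64(row + b"\x00"),
    })
    return ET.tostring(scanner, encoding="unicode")


def open_scanner(base_url: str, table: str, row: bytes, batch: int) -> str:
    """Create a scanner on the gateway and return its URL (assumed to be in Location)."""
    req = urllib.request.Request(
        f"{base_url}/{table}/scanner",
        data=scanner_body(row, batch).encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Location"]


def next_batch(scanner_url: str):
    """Fetch the next chunk of cells as raw XML, or None once nothing is left."""
    req = urllib.request.Request(scanner_url, headers={"Accept": "text/xml"})
    with urllib.request.urlopen(req) as resp:
        # Assumption: a non-200 success status (e.g. 204) means the scanner is exhausted.
        return resp.read() if resp.status == 200 else None
```

For the "two thousand urls, limit to five hundred" case, the idea would be `open_scanner("http://localhost:8080", "tablename", b"united states", 500)` followed by a single `next_batch(...)` call, then a DELETE on the scanner URL to release it on the server side.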
