Thank you Andrew.
St.Ack

On Wed, Mar 16, 2011 at 3:12 PM, Andrew Purtell <[email protected]> wrote:
>> This facility is not exposed in the REST API at the moment
>> (not that I know of -- please someone correct me if I'm
>> wrong).
>
> Wrong. :-)
>
> See ScannerModel in the rest package:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/rest/model/ScannerModel.html
>
> ScannerModel#setBatch
>
>   - Andy
>
> --- On Wed, 3/16/11, Stack <[email protected]> wrote:
>
>> From: Stack <[email protected]>
>> Subject: Re: habse schema design and retrieving values through REST interface
>> To: [email protected]
>> Date: Wednesday, March 16, 2011, 10:47 AM
>>
>> You can limit the return when scanning from the java api; see
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)
>> This facility is not exposed in the REST API at the moment (not that
>> I know of -- please someone correct me if I'm wrong).  So, yes, wide
>> rows, if thousands of elements of some size, since they need to be
>> composed all in RAM, could bring on an OOME if the
>> composed size > available heap.
>>
>> St.Ack
>>
>> On Wed, Mar 16, 2011 at 2:41 AM, sreejith P. K. <[email protected]> wrote:
>> > With this schema, if i can limit the column family over a particular range,
>> > I can manage everything else. (like Select first n columns of a column
>> > family)
>> >
>> > Sreejith
>> >
>> > On Wed, Mar 16, 2011 at 12:33 PM, sreejith P. K. <[email protected]> wrote:
>> >
>> >> @ Jean-Daniel,
>> >>
>> >> As i told, each row key contains thousands of column family values (may be
>> >> i am wrong with the schema design). I started REST and tried to cURL
>> >> http:/localhost/tablename/rowname. It seems it will work only with limited
>> >> amount of data (may be i can limit the cURL output), and how i can limit the
>> >> column values for a particular row?
>> >> Suppose i have two thousand urls under a keyword and i need to fetch the
>> >> urls and should limit the result to five hundred. How it is possible??
>> >>
>> >> @ tsuna,
>> >>
>> >> It seems http://www.elasticsearch.org/ using CouchDB right?
>> >>
>> >> On Tue, Mar 15, 2011 at 11:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >>
>> >>> Can you tell why it's not able to get the bigger rows? Why would you
>> >>> try another schema if you don't even know what's going on right now?
>> >>> If you have the same issue with the new schema, you're back to square
>> >>> one right?
>> >>>
>> >>> Looking at the logs should give you some hints.
>> >>>
>> >>> J-D
>> >>>
>> >>> On Tue, Mar 15, 2011 at 10:19 AM, sreejith P. K. <[email protected]> wrote:
>> >>> > Hello experts,
>> >>> >
>> >>> > I have a scenario as follows,
>> >>> > I need to maintain a huge table for a 'web crawler' project in HBASE.
>> >>> > Basically it contains thousands of keywords and for each keyword i need to
>> >>> > maintain a list of urls (it again will count in thousands). Corresponding to
>> >>> > each url, i need to store a number, which will in turn resemble the priority
>> >>> > value the keyword holds.
>> >>> > Let me explain you a bit, Suppose i have a keyword 'united states', i need
>> >>> > to store about ten thousand urls corresponding to that keyword. Each keyword
>> >>> > will be holding a priority value which is an integer. Again i have thousands
>> >>> > of keywords like that. The rare thing about this is i need to do the project
>> >>> > in PHP.
>> >>> >
>> >>> > I have configured a hadoop-hbase cluster consists of three machines. My plan
>> >>> > was to design the schema by taking the keyword as 'row key'. The urls i will
>> >>> > keep as column family. The schema looked fine at first. I have done a lot of
>> >>> > research on how to retrieve the url list if i know the keyword. Any ways i
>> >>> > managed a way out by preg-matching the xml data out put using the url
>> >>> > http://localhost:8080/tablename/rowkey (REST interface i used). It also
>> >>> > works fine if the url list has a limited number of urls. When it comes in
>> >>> > thousands, it seems i cannot fetch the xml data itself!
>> >>> > Now I am in a do or die situation. Please correct me if my schema design
>> >>> > needs any changes (I do believe it should change!) and please help me up to
>> >>> > retrieve the column family values (urls)
>> >>> > corresponding to each row-key in an efficient way. Please guide me how i
>> >>> > can do the same using PHP-REST interface.
>> >>> > Thanks in advance.
>> >>> >
>> >>> > Sreejith
>> >>> >
>> >>>
>> >>
>> >> --
>> >> Sreejith PK
>> >> Nesote Technologies (P) Ltd
>> >
>> > --
>> > Sreejith PK
>> > Nesote Technologies (P) Ltd
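
For anyone finding this thread later: the REST scanner route that Andy points to (ScannerModel#setBatch) can be sketched roughly as below. This is a minimal sketch and not something posted in the thread. It assumes the REST gateway accepts a <Scanner> XML body whose "batch" attribute maps to setBatch and whose "startRow" attribute is the base64-encoded row key, that it answers with the new scanner's URL in a Location header, and that it listens on http://localhost:8080 as in the original question. The helper names and the table name are hypothetical, and Python is used here only for illustration (the original poster works in PHP); please verify the details against your HBase version.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_scanner_body(row_key: str, batch: int) -> bytes:
    """Build the XML body for a scanner starting at row_key.

    Assumption: 'batch' caps the number of cells returned per fetch,
    and 'startRow' must be base64-encoded because row keys are bytes.
    """
    scanner = ET.Element("Scanner")
    scanner.set("batch", str(batch))
    encoded_key = base64.b64encode(row_key.encode("utf-8")).decode("ascii")
    scanner.set("startRow", encoded_key)
    return ET.tostring(scanner)


def fetch_first_cells(base_url: str, table: str, row_key: str, batch: int) -> bytes:
    """Create a scanner, fetch one batch of cells as XML, then release it.

    Hypothetical helper: base_url and table are whatever your gateway uses.
    """
    # Step 1: create the scanner; the gateway is assumed to reply with
    # the scanner's URL in the Location header.
    create = urllib.request.Request(
        f"{base_url}/{table}/scanner",
        data=build_scanner_body(row_key, batch),
        method="PUT",
        headers={"Content-Type": "text/xml"},
    )
    with urllib.request.urlopen(create) as resp:
        scanner_url = resp.headers["Location"]

    try:
        # Step 2: a single GET returns at most `batch` cells.
        fetch = urllib.request.Request(scanner_url, headers={"Accept": "text/xml"})
        with urllib.request.urlopen(fetch) as resp:
            return resp.read()
    finally:
        # Step 3: always delete the scanner so the server can free it.
        delete = urllib.request.Request(scanner_url, method="DELETE")
        urllib.request.urlopen(delete).close()
```

With the schema discussed above, something like fetch_first_cells("http://localhost:8080", "tablename", "united states", 500) would be the call that asks for the first five hundred cells instead of the whole row. Since startRow only sets where the scan begins, a fetch may also return cells from later rows unless an end row is set as well; treat that as something to check rather than a guarantee.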
