Yes. Specify a column family, or a column family plus column qualifier, to load less than the total row.
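For concreteness, the narrowing described above might look like this sketch against the 0.20-era client API; the table name "mytable", family "cf", and qualifier "q1" are made up for illustration:

```java
// Sketch (HBase 0.20-era client API): fetch a single family or cell
// instead of the whole row. All names here are illustrative.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleCellGet {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "mytable");

    Get g = new Get(Bytes.toBytes("rowname"));
    g.addFamily(Bytes.toBytes("cf"));                       // everything in one family, or...
    g.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q1"));  // ...just one cell

    Result r = table.get(g);  // only the requested cells come back, not the full row
    byte[] value = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q1"));
  }
}
```

Without the addFamily/addColumn calls, the Get asks for the entire row, which is the situation discussed below.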
St.Ack

On Wed, Mar 10, 2010 at 11:36 PM, William Kang <weliam.cl...@gmail.com> wrote:
> Hi,
> I have another question. If we do things like the following:
>
> "Get g = new Get(Bytes.toBytes("rowname"));
> Result r = table.get(g);"
>
> Will HBase load the entire row into memory? Thanks.
>
> William
>
> On Tue, Mar 9, 2010 at 1:26 AM, Stack <st...@duboce.net> wrote:
>> On Mon, Mar 8, 2010 at 6:58 PM, William Kang <weliam.cl...@gmail.com> wrote:
>>> Hi,
>>> Can you give me some more details about how the information in a row can
>>> be fetched? I understand that a file like 1.5 G may have multiple HFiles
>>> in a region server. If the client wants to access a column label value in
>>> that row, what is going to happen?
>>
>> Only that cell is fetched if you specify an explicit column name
>> (column family + qualifier).
>>
>>> After HBase finds the region that stores this row, it goes to the
>>> region's .META. entry and finds the index of the HFile that stores the
>>> column family. And the HFile has the offsets of key-value pairs. Then
>>> HBase can go to the key-value pair and get the value for a certain
>>> column label.
>>
>> Yes.
>>
>>> Why does the whole row need to be read into memory?
>>
>> If you ask for the whole row, it will try to load it all to deliver it
>> all to you. There is no "streaming" API per se. Rather, a Result object
>> is passed from server to client which has in it everything in the row,
>> keyed by column name.
>>
>> That said, if you want the whole row and you are scanning as opposed to
>> getting, TRUNK has HBASE-1537 applied, which allows for intra-row
>> scanning -- you call setBatch to set the maximum returned within a
>> row -- and the 0.20 branch has HBASE-1996, which lets you set the
>> maximum size returned on a next invocation (in both cases, if the row is
>> not exhausted, the next 'next' invocation will return more out of the
>> current row, and so on, until the row is exhausted).
>>
>>> If HBase does not read the whole row at once, what caused its
>>> inefficiency?
>>
>> I think Ryan is just allowing that the above means of scanning parts of
>> rows may have bugs that we've not yet squashed.
>>
>> St.Ack
>>
>>> Thanks.
>>>
>>> William
>>>
>>> On Mon, Mar 8, 2010 at 3:44 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> At this time, truly massive rows such as the one you described may
>>>> behave non-optimally in HBase. While in previous versions of HBase,
>>>> reading an entire row required you to be able to actually read and
>>>> send the entire row in one go, there is a new API that allows you to
>>>> effectively stream rows. There are still some read paths that may
>>>> read more data than necessary, so your performance mileage may vary.
>>>>
>>>> On Sun, Mar 7, 2010 at 3:56 AM, Ahmed Suhail Manzoor
>>>> <suhail...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> This might prove to be a blatantly obvious question, but wouldn't it
>>>>> make sense to store large files directly in HDFS and keep the
>>>>> metadata about the file in HBase? One could, for instance, serialize
>>>>> the details of the HDFS file in a Java object and store that in
>>>>> HBase. This object could expose the reading of the HDFS file, for
>>>>> instance, so that one is left with clean code. Anything wrong with
>>>>> implementing things this way?
>>>>>
>>>>> Cheers
>>>>> su./hail
>>>>>
>>>>> On 07/03/2010 09:21, tsuna wrote:
>>>>>> On Sat, Mar 6, 2010 at 9:14 PM, steven zhuang
>>>>>> <steven.zhuang.1...@gmail.com> wrote:
>>>>>>> I have a table which may contain super big rows, e.g. with
>>>>>>> millions of cells in one row, 1.5GB in size.
>>>>>>>
>>>>>>> Now I have a problem emitting data into the table, probably
>>>>>>> because these super big rows are too large for my regionserver
>>>>>>> (with only 1GB heap).
>>>>>>
>>>>>> A row can't be split, and whatever you do that needs that row (like
>>>>>> reading it) requires that HBase loads the entire row in memory. If
>>>>>> the row is 1.5GB and your regionserver has only 1G of memory, it
>>>>>> won't be able to use that row.
>>>>>>
>>>>>> I'm not 100% sure about that because I'm still an HBase n00b too,
>>>>>> but that's my understanding.
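The intra-row scanning Stack mentions above (HBASE-1537's setBatch) might be sketched like so; the table name "mytable" and family "cf" are assumptions, not from the thread:

```java
// Sketch: intra-row scanning via Scan.setBatch (HBASE-1537, in TRUNK at the time).
// Each next() returns at most 100 cells, even from within a single wide row,
// so a 1.5GB row never has to be materialized in one Result.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class IntraRowScan {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "mytable"); // illustrative name

    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("cf"));
    scan.setBatch(100); // cap cells per next(); a huge row comes back in chunks

    ResultScanner scanner = table.getScanner(scan);
    for (Result chunk : scanner) {
      // Successive Results can belong to the same row key until that row
      // is exhausted, exactly as Stack describes above.
      System.out.println(chunk.size() + " cells from row "
          + Bytes.toString(chunk.getRow()));
    }
    scanner.close();
  }
}
```

Per the thread, the same chunking behavior is available on the 0.20 branch via HBASE-1996's size-based limit rather than a cell-count batch.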
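Suhail's HDFS-plus-metadata pattern could be sketched as follows; the table "filemeta", family "f", and qualifiers "path"/"size" are all hypothetical names chosen for illustration:

```java
// Sketch of the pattern from the thread: payload in HDFS, pointer in HBase.
// The large blob itself never passes through a regionserver, so row size
// stays tiny regardless of file size. All names here are made up.
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HdfsBlobPointer {
  static final byte[] F = Bytes.toBytes("f");

  // Record where a large file lives; only the metadata goes into HBase.
  static void putPointer(HTable meta, String key, Path p, long len) throws Exception {
    Put put = new Put(Bytes.toBytes(key));
    put.add(F, Bytes.toBytes("path"), Bytes.toBytes(p.toString()));
    put.add(F, Bytes.toBytes("size"), Bytes.toBytes(len));
    meta.put(put);
  }

  // Look up the pointer row, then stream the bytes straight from HDFS.
  static FSDataInputStream open(HTable meta, FileSystem fs, String key) throws Exception {
    Result r = meta.get(new Get(Bytes.toBytes(key)));
    String path = Bytes.toString(r.getValue(F, Bytes.toBytes("path")));
    return fs.open(new Path(path));
  }
}
```

This keeps HBase rows small enough to avoid the regionserver-heap problem steven describes, at the cost of a second round trip (HBase for the pointer, then HDFS for the data).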