> so I cannot use the default scan() constructor, as it will scan the whole
> table in one go, which results in an OutOfMemory error in the client process

I don't follow what you mean by this. The client calls next() on the scanner
and gets rows back. setCaching() and setBatch() determine how much data
(rows, cells) is retrieved from the RS to the client in one next() call to
the server. So if caching is set to 100, you will have 100 rows in the
ClientScanner cache. Which version are you using? In older versions the
default caching value was only 1. It was later changed to 100.
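
To make the above concrete, here is a minimal sketch of a full-table scan
with explicit caching. It assumes the older HTable-based client API (newer
clients obtain a Table from a Connection instead), and the table name
"users" is made up for illustration. It needs the HBase client jars and a
running cluster, so treat it as a sketch rather than a tested program.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

class ScanExport {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // "users" is a hypothetical table name.
        HTable table = new HTable(conf, "users");

        // No start/stop row: the scan still covers the whole table,
        // but rows reach the client in chunks, not all at once.
        Scan scan = new Scan();
        // Number of rows fetched from the region server per RPC.
        scan.setCaching(100);
        // scan.setBatch(n) would additionally cap the cells per Result,
        // splitting a wide row across several Results.

        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result result : scanner) {
                // Handle one row at a time; do not collect all rows in a list.
                System.out.println(Bytes.toString(result.getRow()));
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}
```

If the client still runs out of memory with a loop like this, the likely
cause is client code holding on to every Result instead of writing each
row out as it arrives.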


-Anoop-
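
On the second question in the quoted thread (default values for missing
columns): the suggestion there is to fill them in on the client while
iterating the Result. A minimal sketch of that idea in plain Java follows.
The class and method names are hypothetical, and the stored map stands in
for values already read out of a Result (for example with getValue(),
which returns null when a column is absent).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class CsvDefaults {
    /**
     * Builds one CSV line: the row key, then one value per known column.
     * The iteration order of defaults fixes the column order of the file.
     * Note: no quoting or escaping is done, so values must not contain commas.
     */
    static String toCsvLine(String rowKey,
                            Map<String, String> stored,
                            Map<String, String> defaults) {
        StringBuilder line = new StringBuilder(rowKey);
        for (Map.Entry<String, String> column : defaults.entrySet()) {
            String value = stored.get(column.getKey());
            // Fall back to the default when the row has no value for the column.
            line.append(',').append(value != null ? value : column.getValue());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so columns stay stable.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("name", "unknown");
        defaults.put("age", "0");

        Map<String, String> stored = new HashMap<>();
        stored.put("name", "vimal");

        System.out.println(toCsvLine("row1", stored, defaults));
    }
}
```

Each line is built from a single row, so it can be written to the file
immediately and nothing accumulates in client memory.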

On Fri, Jun 28, 2013 at 4:02 AM, Michael Segel <michael_se...@hotmail.com> wrote:

> Phoenix, Hive, Pig, Java would all work.
> But to Azuryy Yu's post...
>
> The OP is doing a simple scan() to get rows.
> If the OP is hitting an OOM exception then it's a code issue on the part of
> the OP.
>
>
> On Jun 27, 2013, at 2:22 AM, Azuryy Yu <azury...@gmail.com> wrote:
>
> > Sorry, maybe Phoenix is not suitable for you.
> >
> >
> > On Thu, Jun 27, 2013 at 3:21 PM, Azuryy Yu <azury...@gmail.com> wrote:
> >
> >> 1) Scan.setCaching() to specify the number of rows for caching that will
> >> be passed to scanners.
> >>    and what's your block cache size?
> >>
> >>    but if the OOM comes from the client, not the server side, then I
> >> don't think this is Scan related, please check your client code.
> >>
> >> 2) we cannot add a default value from HBase, but you can add it on your
> >> client when iterating over the Result.
> >>
> >> Also, you can use Phoenix, this is cool for your scenario.
> >> https://github.com/forcedotcom/phoenix
> >>
> >>
> >>
> >> On Thu, Jun 27, 2013 at 3:11 PM, Vimal Jain <vkj...@gmail.com> wrote:
> >>
> >>> Hi,
> >>> I am trying to export from HBase to a CSV file.
> >>> I am using the "Scan" class to scan all data in the table.
> >>> But I am facing some problems while doing it.
> >>>
> >>> 1) My table has around 1.5 million rows and around 150 columns for each
> >>> row, so I cannot use the default scan() constructor, as it will scan the
> >>> whole table in one go, which results in an OutOfMemory error in the
> >>> client process. I have heard of using setCaching() and setBatch(), but I
> >>> am not able to understand how they will solve the OOM error.
> >>>
> >>> I thought of providing startRow and stopRow in the scan object, but I
> >>> want to scan the whole table, so how will this help?
> >>>
> >>> 2) As HBase stores data for a row only when we explicitly provide it,
> >>> and there is no concept of a default value as found in an RDBMS, I want
> >>> to have each and every column in the CSV file I generate for every user.
> >>> In case column values are not there in HBase, I want to use default
> >>> values for them (I have a list of default values for each column). Is
> >>> there any method in the Result class or any other class to accomplish
> >>> this?
> >>>
> >>>
> >>> Please help here.
> >>>
> >>> --
> >>> Thanks and Regards,
> >>> Vimal Jain
> >>>
> >>
> >>
>
>
