Re: Running a Jython import job

Andrew Nguyen Fri, 23 Jul 2010 10:18:35 -0700

St.Ack,

Thanks for the clarification - I just wanted to confirm that the debug 
messages were only debug output and not a sign of something starting 
to go wrong.  I happened to be looking at the DFS usage % and it keeps going up 
and down (I figured it should only be increasing), so that got me looking at the 
job's log...

The Jython page on the wiki was extremely useful.  I had never actually used 
Jython before, but I am a big fan of Python for getting stuff up quickly, so it 
seemed to be a natural progression.  Having said that, I am looking at 
importing a ton of rows (not sure how many, but hundreds of millions to 
billions).  Are there any good examples of doing this as efficiently as 
possible?  And how does Jython compare to a pure Java approach?

Currently, I have a for loop just calling table.put(p) repeatedly.  I also have 
the WAL disabled and autoflush set to false, and I have increased the buffer.  
Is there anything else I should consider?
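
For illustration only, here is a minimal sketch of one variation on that loop: grouping the puts into fixed-size batches instead of calling table.put(p) once per row. It assumes the table object accepts a list of puts (as HTable.put(List<Put>) does in the HBase client API); make_put, import_rows, batches, and the batch size of 1000 are hypothetical names and values chosen for the example, not anything from the job itself.

```python
# Sketch only: group puts into fixed-size batches before handing them to the table.
# `table` is assumed to expose put(list_of_puts); `make_put` is a hypothetical
# helper that turns one input record into a Put.

def batches(items, size):
    """Yield lists of at most `size` items from any iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def import_rows(table, records, make_put, batch_size=1000):
    """Send records to `table` in batches; return the number of rows sent."""
    sent = 0
    for chunk in batches(records, batch_size):
        table.put([make_put(r) for r in chunk])
        sent += len(chunk)
    return sent
```

With autoflush off, a final table.flushCommits() after the loop would still be needed to push whatever remains in the client-side buffer.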

Thanks!

--Andrew

--
Andrew Nguyen
[email protected]

The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain confidential or privileged information.  Any unauthorized review, 
dissemination, distribution, or copying of this communication is prohibited.  
If you are not the intended recipient, please notify the sender immediately by 
reply e-mail, and destroy all copies of this message and any attachments from 
your files.

On Jul 23, 2010, at 10:05 AM, Stack wrote:

> This is just our noisy client talking about the caching of region
> locations out on the cluster (You are at DEBUG level).  Turn off DEBUG
> in client if you'd rather not see the messages -- see the FAQ for how
> -- or just ignore.  When they turn WARN or ERROR, start paying
> attention.
> 
> Did the Jython page up on the wiki help?
> Yours,
> St.Ack
> 
> On Fri, Jul 23, 2010 at 9:58 AM, Andrew Nguyen
> <[email protected]> wrote:
>> Hello all,
>> 
>> I am running a job from Jython that is importing time series data into 
>> HBase.  I started to see the following messages and wanted to dive deeper to 
>> find out if they are true errors or just debug messages:
>> 
>> 10/07/23 09:51:07 DEBUG client.HConnectionManager$TableServers: Reloading 
>> region subset,a40506-2016/07/23-20:33:30.296,1279902520534 location because 
>> regionserver didn't accept updates; tries=0 of max=10, waiting=1000ms
>> 10/07/23 09:51:08 DEBUG client.HConnectionManager$TableServers: Cached 
>> location for .META.,,1 is 10.10.11.3:60020
>> 10/07/23 09:51:08 DEBUG client.HConnectionManager$TableServers: 
>> locateRegionInMeta attempt 0 of 10 failed; retrying after sleep of 1000 
>> because: No server address listed in .META. for region 
>> subset,a40506-2016/07/24-07:00:35.528,1279903897169
>> 10/07/23 09:51:09 DEBUG client.HConnectionManager$TableServers: Cached 
>> location for subset,a40506-2016/07/24-07:00:35.528,1279903897169 is 
>> 10.10.11.2:60020
>> 
>> I did some searches on Google and this seems to point to a potential lack 
>> of memory.  Currently, HBase is set up with a heap of 2G for each slave, and 
>> there are 6 slaves.  Each slave has a total of 8G of RAM installed.  If you 
>> guys have any guidance on what other settings I should look at, please let 
>> me know.
>> 
>> Thanks!
>> 
>> --Andrew
