[jira] [Updated] (GORA-130) gora-accumulo caches tablet locations between map reduce jobs

Keith Turner (JIRA) Fri, 11 May 2012 07:47:18 -0700

     [ 
https://issues.apache.org/jira/browse/GORA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Keith Turner updated GORA-130:
------------------------------

    Description: 
Enis added a new Loop program to goraci that continually runs Generation and 
Verification map reduce jobs.  So you have one process launching multiple map 
reduce jobs.  While I was running this, I noticed an issue.  After the first round of 
generation, the table had 16 tablets.  So verification ran with 16 mappers, one 
per tablet.  Then more data was inserted and the table split to 32 tablets.  
When verification ran again it started 16 mappers instead of 32.  Turns out the 
gora-accumulo store was using stale cached information about the table to 
create the input splits for the map reduce job.

This issue will not affect the simple usage pattern of a single Java process 
launching one map reduce job that reads from Accumulo.

  was:Enis added a new Loop program to goraci that continually runs Generation 
and Verification map reduce jobs.  So you have one process launching multiple 
map reduce jobs.  I was running this I noticed an issue.  After the first round 
of generation, the table had 16 tablets.  So verification ran with 16 mappers, 
one per tablet.  Then more data was inserted and the table split to 32 tablets. 
 When verification ran it started 16 mappers instead of 32.  Turns out the 
gora-accumulo store was using stale cached information about the table to 
create the input splits for the map reduce job.

    
> gora-accumulo caches tablet locations between map reduce jobs 
> --------------------------------------------------------------
>
>                 Key: GORA-130
>                 URL: https://issues.apache.org/jira/browse/GORA-130
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: storage-accumulo
>    Affects Versions: 0.2
>            Reporter: Keith Turner
>             Fix For: 0.3
>
>
> Enis added a new Loop program to goraci that continually runs Generation and 
> Verification map reduce jobs.  So you have one process launching multiple map 
> reduce jobs.  While I was running this, I noticed an issue.  After the first round 
> of generation, the table had 16 tablets.  So verification ran with 16 
> mappers, one per tablet.  Then more data was inserted and the table split to 
> 32 tablets.  When verification ran again it started 16 mappers instead of 32. 
>  Turns out the gora-accumulo store was using stale cached information about 
> the table to create the input splits for the map reduce job.
> This issue will not affect the simple usage pattern of a single Java process 
> launching one map reduce job that reads from Accumulo.
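
The failure mode described above can be illustrated with a toy model. This is 
only a sketch under stated assumptions: FakeTable, SplitPlanner, planSplits and 
invalidate are hypothetical names invented for illustration, not gora-accumulo 
or Accumulo APIs, and the real store's caching may be structured differently. 
It shows a process-wide cache returning the tablet list seen by the first job 
until the cache is explicitly cleared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these names come from gora-accumulo or Accumulo. */
public class StaleSplitCacheSketch {

    /** Hypothetical stand-in for a table whose tablet count grows as it splits. */
    static class FakeTable {
        private final List<String> tablets = new ArrayList<>();

        FakeTable(int tabletCount) {
            resize(tabletCount);
        }

        /** Simulates the table splitting (or merging) to a new tablet count. */
        void resize(int tabletCount) {
            tablets.clear();
            for (int i = 0; i < tabletCount; i++) {
                tablets.add("tablet-" + i);
            }
        }

        /** Returns a snapshot of the tablets that exist right now. */
        List<String> currentTablets() {
            return new ArrayList<>(tablets);
        }
    }

    /** Hypothetical planner that creates one input split per tablet and caches the result. */
    static class SplitPlanner {
        private final Map<String, List<String>> cache = new HashMap<>();

        /** Looks up tablets once per table name, then reuses the cached list. */
        List<String> planSplits(String tableName, FakeTable table) {
            return cache.computeIfAbsent(tableName, name -> table.currentTablets());
        }

        /** Drops cached tablet information so the next job sees the current layout. */
        void invalidate(String tableName) {
            cache.remove(tableName);
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable(16);
        SplitPlanner planner = new SplitPlanner();

        // First verification job: one split per tablet.
        System.out.println("job 1 splits: " + planner.planSplits("ci", table).size());

        // More data arrives and the table splits, but the cache still holds the old list.
        table.resize(32);
        System.out.println("job 2 splits (stale): " + planner.planSplits("ci", table).size());

        // Clearing the cache before planning the next job picks up the new tablets.
        planner.invalidate("ci");
        System.out.println("job 3 splits (fresh): " + planner.planSplits("ci", table).size());
    }
}
```

In this model a single process that launches only one job never hits the stale 
path, which matches the note above about the simple usage pattern.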

