[ https://issues.apache.org/jira/browse/GORA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Keith Turner updated GORA-130:
------------------------------
Description:
Enis added a new Loop program to goraci that continually runs Generation and
Verification map reduce jobs, so one process launches multiple map reduce
jobs. While running this, I noticed an issue. After the first round of
generation, the table had 16 tablets, so verification ran with 16 mappers, one
per tablet. Then more data was inserted and the table split to 32 tablets.
When verification ran again, it started 16 mappers instead of 32. It turns out
the gora-accumulo store was using stale cached information about the table to
create the input splits for the map reduce job.
This issue will not affect the simple usage pattern of a single Java process
launching one map reduce job that reads from Accumulo.
was: Enis added a new Loop program to goraci that continually runs Generation
and Verification map reduce jobs, so one process launches multiple map reduce
jobs. While running this, I noticed an issue. After the first round of
generation, the table had 16 tablets, so verification ran with 16 mappers,
one per tablet. Then more data was inserted and the table split to 32 tablets.
When verification ran, it started 16 mappers instead of 32. It turns out the
gora-accumulo store was using stale cached information about the table to
create the input splits for the map reduce job.
> gora-accumulo caches tablet locations between map reduce jobs
> --------------------------------------------------------------
>
> Key: GORA-130
> URL: https://issues.apache.org/jira/browse/GORA-130
> Project: Apache Gora
> Issue Type: Bug
> Components: storage-accumulo
> Affects Versions: 0.2
> Reporter: Keith Turner
> Fix For: 0.3
>
>
> Enis added a new Loop program to goraci that continually runs Generation and
> Verification map reduce jobs, so one process launches multiple map reduce
> jobs. While running this, I noticed an issue. After the first round of
> generation, the table had 16 tablets, so verification ran with 16 mappers,
> one per tablet. Then more data was inserted and the table split to 32
> tablets. When verification ran again, it started 16 mappers instead of 32.
> It turns out the gora-accumulo store was using stale cached information
> about the table to create the input splits for the map reduce job.
> This issue will not affect the simple usage pattern of a single Java process
> launching one map reduce job that reads from Accumulo.
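
To illustrate the failure mode described above, here is a minimal, self-contained sketch in plain Java. It does not use the real gora-accumulo or Accumulo classes; FakeTable, TabletCache, and inputSplits are hypothetical names invented for this example, and the "refresh" flag is only one assumed way of avoiding the stale lookup, not necessarily how the fix is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a table whose tablet count grows as data is inserted. */
class FakeTable {
    private int tabletCount;

    FakeTable(int tabletCount) { this.tabletCount = tabletCount; }

    int tabletCount() { return tabletCount; }

    void splitTo(int newCount) { tabletCount = newCount; }
}

/** Hypothetical cache of tablet counts that outlives a single map reduce job. */
class TabletCache {
    private final Map<String, Integer> cached = new HashMap<>();

    /** Returns the cached tablet count, asking the table only on the first call. */
    int lookup(String tableName, FakeTable table) {
        return cached.computeIfAbsent(tableName, k -> table.tabletCount());
    }

    /** Drops the cached entry so the next lookup asks the table again. */
    void invalidate(String tableName) { cached.remove(tableName); }
}

public class StaleSplitsDemo {
    /** Builds one input split (and hence one mapper) per tablet the cache reports. */
    static List<String> inputSplits(TabletCache cache, String tableName,
                                    FakeTable table, boolean refresh) {
        if (refresh) {
            cache.invalidate(tableName); // assumed remedy: discard stale info first
        }
        int tablets = cache.lookup(tableName, table);
        List<String> splits = new ArrayList<>();
        for (int i = 0; i < tablets; i++) {
            splits.add(tableName + "-tablet-" + i);
        }
        return splits;
    }

    public static void main(String[] args) {
        TabletCache cache = new TabletCache(); // shared by all jobs in this process
        FakeTable table = new FakeTable(16);

        // First verification job: 16 tablets, 16 splits.
        System.out.println(inputSplits(cache, "ci", table, false).size()); // prints 16

        table.splitTo(32); // more data inserted, table splits

        // Second job in the same process without a refresh: still 16 splits.
        System.out.println(inputSplits(cache, "ci", table, false).size()); // prints 16

        // Second job with the cache invalidated first: 32 splits.
        System.out.println(inputSplits(cache, "ci", table, true).size());  // prints 32
    }
}
```

A process that launches only one job never reaches the second lookup, which matches the note above that the simple single-job usage pattern is not affected.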