[PR] limits tablets and offers bulk import as option for ingest [accumulo-testing]

via GitHub Sun, 17 Nov 2024 18:56:17 -0800


keith-turner opened a new pull request, #287:
URL: https://github.com/apache/accumulo-testing/pull/287


   This change introduces two new continuous ingest features. First, options 
were added to limit the number of tablets written to.  Second, an option was 
added to use bulk import instead of a batch writer.
   
   These features support running a test like the following.
   
    * create a continuous ingest table with 1000 tablets
    * start 100 continuous ingest clients
    * have each client continually bulk import data to 10 random tablets
   
   This test situation will create a lot of bulk import and subsequent 
compaction activity for Accumulo to handle.
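
   As a rough illustration of what "10 random tablets" means for one client, the following minimal Python sketch picks 10 of 1000 tablets and generates rows that fall only inside them. It is not the code from this change. It assumes the table is split evenly over a non-negative 64-bit row space, that rows are written as fixed-width hex strings, and it treats tablet ranges as half-open for simplicity. All names (`tablet_range`, `pick_tablets`, `random_row`, `format_row`) are hypothetical.

```python
import random

# Assumptions for illustration only, not taken from the change itself.
ROW_SPACE = 2**63          # rows are non-negative 64-bit values
NUM_TABLETS = 1000         # table is pre-split into evenly sized tablets
TABLETS_PER_CLIENT = 10    # each client restricts its writes to this many


def tablet_range(index, num_tablets=NUM_TABLETS, row_space=ROW_SPACE):
    """Return the half-open row range [start, end) of an evenly split tablet."""
    start = index * row_space // num_tablets
    end = (index + 1) * row_space // num_tablets
    return start, end


def pick_tablets(rng, count=TABLETS_PER_CLIENT, num_tablets=NUM_TABLETS):
    """Choose `count` distinct tablet indexes uniformly at random."""
    return sorted(rng.sample(range(num_tablets), count))


def random_row(rng, tablets):
    """Generate one row that falls inside one of the chosen tablets."""
    start, end = tablet_range(rng.choice(tablets))
    return rng.randrange(start, end)


def format_row(row):
    """Format a row as fixed-width hex so string order matches numeric order."""
    return f"{row:016x}"
```

   A client in this sketch would call `pick_tablets` once and then call `random_row` repeatedly, so every file it bulk imports touches at most 10 tablets.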
   
   These changes add bulk import to the `cingest ingest` command.  There is an 
existing `cingest bulk` command that runs a map reduce job to create bulk 
files.  These changes do not remove the need for the existing map reduce job; 
they serve a different purpose.  The map reduce job can generate a very large 
amount of data for a single bulk import.  These changes allow generating many 
bulk imports with small amounts of data each, but could never generate the 
amount of data for a single bulk import that the map reduce job can. The 
following is an example of a test scenario that could use both.
   
    * create a continuous ingest table with 1000 tablets
    * use the map reduce bulk job to create an initial 10 billion entries in 
the table
    * start 100 continuous ingest clients
    * have each client continually bulk import data to 10 random tablets
    * stop clients after 12 hours and verify data
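
   To get a feel for how widely the load in such a scenario spreads, the sketch below estimates how many distinct tablets receive bulk imports. It assumes each client picks its tablets once, uniformly at random and independently of the other clients; the description above does not say whether clients re-pick tablets over time. The function name is hypothetical.

```python
def expected_tablets_touched(num_tablets, num_clients, tablets_per_client):
    """Expected number of distinct tablets chosen by at least one client.

    Assumes every client independently picks `tablets_per_client` distinct
    tablets uniformly at random.
    """
    # Chance that one client's selection does not include a given tablet.
    p_missed_by_one = 1 - tablets_per_client / num_tablets
    # Clients choose independently, so the misses multiply.
    p_missed_by_all = p_missed_by_one ** num_clients
    return num_tablets * (1 - p_missed_by_all)
```

   Under these assumptions, `expected_tablets_touched(1000, 100, 10)` comes out to roughly 634, so about a third of the tablets would see no bulk imports unless clients re-pick their tablets periodically.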


