user@accumulo,
I was working with the Wikipedia Accumulo ingest examples, trying to make the ingest of a single archive file as fast as ingesting multiple archives in parallel. I increased the number of splits the job created for the single archive so that all of the servers could work on ingesting it at the same time. What I noticed, however, was that having every server work on the same file was still nowhere near as fast as ingesting from multiple files. Could I have some insight into the design of the Wikipedia ingest that would explain this behavior?

Thank you for your time,
Patrick Lynch
