[ 
https://issues.apache.org/jira/browse/ACCUMULO-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reopened ACCUMULO-4165:
------------------------------------

> Create a user level API for RFile
> ---------------------------------
>
>                 Key: ACCUMULO-4165
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4165
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Blocker
>             Fix For: 1.8.0
>
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Users can bulk import RFiles. Currently, the only way to create RFiles
> through Accumulo's public API is AccumuloFileOutputFormat, and the public API
> offers no way to read them. The internal APIs for reading and writing RFiles
> are also cumbersome to use.
>
> I am experimenting with a simple RFile API like the following. Below is an
> example of writing data.
> {code:java}
>     // Write to the local file system.
>     LocalFileSystem localFs = FileSystem.getLocal(new Configuration());
>     RFileWriter writer = RFileFactory.newWriter()
>         .withFileName("/tmp/test100M.rf")
>         .withFileSystem(localFs)
>         .build();
>     writer.startDefaultLocalityGroup();
>     // 10M rows x 10 column qualifiers = 100M entries.
>     // genKey and genVal are helper methods not shown here.
>     for (int r = 0; r < 10000000; r++) {
>       for (int cq = 0; cq < 10; cq++) {
>         writer.append(genKey(r, cq), genVal(r, cq));
>       }
>     }
>     writer.close();
> {code}
> Below is an example of reading data.
> {code:java}
>     LocalFileSystem localFs = FileSystem.getLocal(new Configuration());
>     Scanner scanner = RFileFactory.newScanner()
>         .withFileName("/tmp/test100M.rf")
>         .withFileSystem(localFs)
>         .withDataCache(250000000)
>         .withIndexCache(1000000)
>         .build();
> {code}
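
The reading example in the description stops once the scanner is built. As a rough illustration of how a fluent builder of this shape might be written and then consumed, here is a self-contained sketch with no Accumulo dependency. Everything in it is a hypothetical stand-in and not the proposed RFile API: `RFileApiSketch`, its `Writer`, `WriterBuilder` and `ScannerBuilder` classes, the in-memory `FILES` map that replaces a real file system, and the use of plain strings in place of Accumulo keys and values. The sorted-order check in `append` is an assumption about how a writer for a sorted file format would behave.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

// Hypothetical stand-in for a fluent RFile-style API; not Accumulo code.
class RFileApiSketch {
  // In-memory "file system": file name -> entries in append order.
  static final Map<String, List<Entry<String, String>>> FILES = new HashMap<>();

  static final class Writer implements AutoCloseable {
    private final List<Entry<String, String>> entries;
    private String lastKey;
    private boolean closed;

    Writer(List<Entry<String, String>> entries) {
      this.entries = entries;
    }

    // Assumption: a sorted file format rejects keys appended out of order.
    void append(String key, String value) {
      if (closed) {
        throw new IllegalStateException("writer is closed");
      }
      if (lastKey != null && key.compareTo(lastKey) < 0) {
        throw new IllegalArgumentException("keys must be appended in sorted order");
      }
      entries.add(new SimpleImmutableEntry<>(key, value));
      lastKey = key;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  static final class WriterBuilder {
    private String fileName;

    WriterBuilder withFileName(String name) {
      this.fileName = name;
      return this;
    }

    Writer build() {
      if (fileName == null) {
        throw new IllegalStateException("file name is required");
      }
      List<Entry<String, String>> entries = new ArrayList<>();
      FILES.put(fileName, entries);
      return new Writer(entries);
    }
  }

  static final class ScannerBuilder {
    private String fileName;

    ScannerBuilder withFileName(String name) {
      this.fileName = name;
      return this;
    }

    // Returns an Iterable so the result can be used in a for-each loop.
    Iterable<Entry<String, String>> build() {
      List<Entry<String, String>> entries = FILES.get(fileName);
      if (entries == null) {
        throw new IllegalArgumentException("no such file: " + fileName);
      }
      return Collections.unmodifiableList(entries);
    }
  }

  static WriterBuilder newWriter() {
    return new WriterBuilder();
  }

  static ScannerBuilder newScanner() {
    return new ScannerBuilder();
  }

  // Writes 3 rows x 2 column qualifiers, reads them back, returns the entry count.
  static int roundTrip() {
    try (Writer writer = newWriter().withFileName("/tmp/sketch.rf").build()) {
      for (int r = 0; r < 3; r++) {
        for (int cq = 0; cq < 2; cq++) {
          writer.append(String.format("row%03d:cq%d", r, cq), "val" + r + "_" + cq);
        }
      }
    }
    int count = 0;
    for (Entry<String, String> entry : newScanner().withFileName("/tmp/sketch.rf").build()) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(roundTrip());
  }
}
```

The builder methods each return `this`, which is what allows the chained `withFileName(...).build()` style used in the description. Returning an `Iterable` from the scanner builder keeps the read side down to a single for-each loop.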



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
