Hi all.
I am John Mora, a GSoC student working with the Apache Gora community
to implement a Kudu DataStore for Gora.
I am currently having an issue with KuduScanner. Could you please give
me some ideas about what I am doing wrong?
I am using kudu-client for Java [1] and testing my code with
KuduTestHarness [2].
My code looks like this:
List<ColumnSchema> columns = new ArrayList<>();
columns.add(new ColumnSchema.ColumnSchemaBuilder("pkurl", Type.STRING)
    .key(true).build());
columns.add(new ColumnSchema.ColumnSchemaBuilder("content", Type.BINARY)
    .nullable(true).build());
columns.add(new ColumnSchema.ColumnSchemaBuilder("parsedContent", Type.STRING)
    .nullable(true).build());
List<String> keys = new ArrayList<>();
keys.add("pkurl");
Schema sch = new Schema(columns);
CreateTableOptions cto = new CreateTableOptions();
cto.setRangePartitionColumns(keys);
PartialRow lowerPar1 = sch.newPartialRow();
PartialRow upperPar1 = sch.newPartialRow();
upperPar1.addString("pkurl", "http://bar.com/");
cto.addRangePartition(lowerPar1, upperPar1);
PartialRow lowerPar2 = sch.newPartialRow();
PartialRow upperPar2 = sch.newPartialRow();
lowerPar2.addString("pkurl", "http://bar.com/");
cto.addRangePartition(lowerPar2, upperPar2);
table = client.createTable(kuduMapping.getTableName(), sch, cto);
// Insert some data using table.newInsert():
// {pkurl:"http://foo.com/1.html", content:[...], parsedContent:[..]}
// {pkurl:"http://baz.com/1.jsp&q=barbaz", content:[...], parsedContent:[..]}
// {pkurl:"http://baz.com/1.jsp&q=barbaz&p=foo", content:[...], parsedContent:[..]}
// Scanner
KuduScanner.KuduScannerBuilder scannerBuilder = client.newScannerBuilder(table);
List<String> dbFields = new ArrayList<>();
dbFields.add("pkurl");
dbFields.add("content");
dbFields.add("parsedContent");
scannerBuilder.setProjectedColumnNames(dbFields);
KuduScanner build = scannerBuilder.build();
RowResultIterator resultIt = build.nextRows();
// Actual: RowResultIterator is empty.
// Expected: RowResultIterator has 3 entries.
I tested the same code with cto.addHashPartitions(keys, 2) instead of
addRangePartition, and it works fine.
Why do I get an empty result when using addRangePartition?
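In case it helps narrow things down, here is a small standalone check (plain Java, no Kudu calls) of which of the two ranges each test key should fall into. It assumes that the range bounds are lower-inclusive and upper-exclusive, and that Kudu orders these ASCII-only keys the same way String.compareTo does. RangePartitionCheck and partitionOf are names made up for this sketch; they are not part of kudu-client.

```java
import java.util.List;

public class RangePartitionCheck {
    // Split point copied from the table definition above.
    static final String SPLIT = "http://bar.com/";

    // Returns 1 for the range (unbounded, SPLIT) and 2 for [SPLIT, unbounded).
    // Assumption: for ASCII-only keys, String.compareTo gives the same
    // ordering as the comparison Kudu applies to STRING key columns.
    static int partitionOf(String pkurl) {
        return pkurl.compareTo(SPLIT) < 0 ? 1 : 2;
    }

    public static void main(String[] args) {
        // The three keys inserted in the test above.
        List<String> keys = List.of(
            "http://foo.com/1.html",
            "http://baz.com/1.jsp&q=barbaz",
            "http://baz.com/1.jsp&q=barbaz&p=foo");
        for (String key : keys) {
            System.out.println(key + " -> range partition " + partitionOf(key));
        }
    }
}
```

If those assumptions hold, all three rows belong to the second range and the first range holds no rows. I am not sure whether that is related to the empty result.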
Cheers,
John
[1] https://kudu.apache.org/docs/developing.html#_maven_artifacts
[2] https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing