[
https://issues.apache.org/jira/browse/PHOENIX-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615215#comment-16615215
]
Swaroopa Kadam commented on PHOENIX-4872:
-----------------------------------------
I am using the test case below to try to reproduce the issue, and the test
passes without any problems. Could you please tell me if anything is missing
from the test, [~mini666]?
{code:java}
@Test
public void testImportInSingleCellArrayWithOffsetsTable() throws Exception {
    String tableName = generateUniqueName();
    Statement stmt = conn.createStatement();
    // Multi-column-family immutable table using the SCAWO storage scheme.
    stmt.execute("CREATE IMMUTABLE TABLE S.TABLE12 (ID INTEGER NOT NULL PRIMARY KEY, "
            + "NAME VARCHAR, T DATE, CF1.T2 DATE, CF2.T3 DATE) "
            + "IMMUTABLE_STORAGE_SCHEME=SINGLE_CELL_ARRAY_WITH_OFFSETS");
    PhoenixConnection phxConn = conn.unwrap(PhoenixConnection.class);
    PTable table = phxConn.getTable(new PTableKey(null, "S.TABLE12"));
    assertEquals(PTable.ImmutableStorageScheme.SINGLE_CELL_ARRAY_WITH_OFFSETS,
            table.getImmutableStorageScheme());

    // Write the CSV input for the bulk load.
    FileSystem fs = FileSystem.get(getUtility().getConfiguration());
    FSDataOutputStream outputStream = fs.create(new Path("/tmp/inputSCAWO.csv"));
    PrintWriter printWriter = new PrintWriter(outputStream);
    printWriter.println("1,Name 1,1970/01/01,1970/02/01,1970/03/01");
    printWriter.println("2,Name 2,1970/01/02,1970/02/02,1970/03/02");
    printWriter.println("1,Name 1,1970/01/01,1970/02/03,1970/03/01");
    printWriter.println("2,Name 2,1970/01/02,1970/02/04,1970/03/02");
    printWriter.println("1,Name 1,1970/01/01,1970/02/05,1970/03/01");
    printWriter.println("2,Name 2,1970/01/02,1970/02/06,1970/03/02");
    printWriter.println("1,Name 1,1970/01/01,1970/02/07,1970/03/01");
    printWriter.println("2,Name 2,1970/01/02,1970/02/08,1970/03/02");
    printWriter.close();

    // Run the bulk load.
    CsvBulkLoadTool csvBulkLoadTool = new CsvBulkLoadTool();
    csvBulkLoadTool.setConf(new Configuration(getUtility().getConfiguration()));
    csvBulkLoadTool.getConf().set(DATE_FORMAT_ATTRIB, "yyyy/MM/dd");
    int exitCode = csvBulkLoadTool.run(new String[] {
            "--input", "/tmp/inputSCAWO.csv",
            "--table", "table12",
            "--schema", "s",
            "--zookeeper", zkQuorum});
    assertEquals(0, exitCode);

    // GROUP BY query over a column in a non-default column family.
    ResultSet rs = stmt.executeQuery(
            "SELECT name, max(CF1.T2) FROM s.table12 GROUP BY name");
    assertTrue(rs.next());
    assertEquals("Name 1", rs.getString(1));
    assertEquals(DateUtil.parseDate("1970-02-07"), rs.getDate(2));
    assertTrue(rs.next());
    assertEquals("Name 2", rs.getString(1));
    assertEquals(DateUtil.parseDate("1970-02-08"), rs.getDate(2));
    assertFalse(rs.next());
    rs.close();
    stmt.close();
}
{code}
> BulkLoad has bug when loading on single-cell-array-with-offsets table.
> ----------------------------------------------------------------------
>
> Key: PHOENIX-4872
> URL: https://issues.apache.org/jira/browse/PHOENIX-4872
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0, 4.12.0, 4.13.0, 4.14.0
> Reporter: JeongMin Ju
> Assignee: Swaroopa Kadam
> Priority: Critical
>
> CsvBulkLoadTool creates incorrect data for SCAWO
> (SingleCellArrayWithOffsets) tables.
> Every Phoenix table needs a marker (empty) column, but CsvBulkLoadTool does
> not create that column for SCAWO tables.
> If you check the data through the HBase shell, you can see that the
> corresponding column is missing.
> When the rows are written with an UPSERT query, the column is created normally:
> {code:java}
> column=0:\x00\x00\x00\x00, timestamp=1535420036372, value=x
> {code}
> Because the column shown above is missing, every GROUP BY query comes back
> empty. This is because "families":
> {"0": ["\\x00\\x00\\x00\\x00"]}
> is added to the Scan object, which restricts the scan to that column.
> Since CsvBulkLoadTool has not created the column, the result of the
> scan is empty.
>
> This problem applies only to tables with multiple column families. A table
> with a single column family happens to work, because
> "families": {"0": ["ALL"]} is added to the Scan object for such a table,
> so the scan is not restricted to the marker column.
>
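The scan behaviour described in the issue can be illustrated with a small, self-contained toy model. This is only a sketch of the reasoning, not Phoenix or HBase code: the class `MarkerColumnScanModel`, its methods, and the `packed` qualifier standing in for the single-cell array are all hypothetical names chosen for illustration. The families `0`, `CF1` and `CF2` and the marker qualifier are taken from the test and the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a column-restricted scan; hypothetical, not Phoenix or HBase code. */
public class MarkerColumnScanModel {

    // Marker (empty) column qualifier as printed by the HBase shell in the issue.
    static final String MARKER = "\\x00\\x00\\x00\\x00";

    // Returns the keys of rows that contain at least one matching cell.
    // Cells are modelled as "family:qualifier" strings.
    // A null qualifier means "all columns of the family" (the "ALL" case).
    static List<String> scan(Map<String, Set<String>> rows, String family, String qualifier) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            for (String cell : row.getValue()) {
                boolean match = (qualifier == null)
                        ? cell.startsWith(family + ":")
                        : cell.equals(family + ":" + qualifier);
                if (match) {
                    hits.add(row.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    // Rows as the issue says the bulk load writes them: no marker cell.
    // "packed" is a placeholder qualifier for the single-cell array.
    static Map<String, Set<String>> bulkLoadedRows() {
        Map<String, Set<String>> rows = new LinkedHashMap<>();
        for (String key : Arrays.asList("1", "2")) {
            rows.put(key, new HashSet<>(Arrays.asList("0:packed", "CF1:packed", "CF2:packed")));
        }
        return rows;
    }

    // Rows as the issue says UPSERT writes them: same cells plus the marker cell.
    static Map<String, Set<String>> upsertedRows() {
        Map<String, Set<String>> rows = bulkLoadedRows();
        for (Set<String> cells : rows.values()) {
            cells.add("0:" + MARKER);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Multi-family case from the issue: scan restricted to the marker column.
        System.out.println(scan(bulkLoadedRows(), "0", MARKER));
        System.out.println(scan(upsertedRows(), "0", MARKER));
        // Single-family case from the issue: scan over all columns of family "0".
        System.out.println(scan(bulkLoadedRows(), "0", null));
    }
}
```

In this model, a scan restricted to the marker column finds no bulk-loaded rows, while the same scan over upserted rows, or an unrestricted scan over family `0`, finds both. That matches the reported symptom, assuming the Scan is built the way the issue description states.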
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)