[
https://issues.apache.org/jira/browse/PHOENIX-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Soldatov updated PHOENIX-3521:
-------------------------------------
Attachment: gen.py
The Python script used to generate 1.csv.
> Scan over local index may return incorrect result after flush & compaction
> --------------------------------------------------------------------------
>
> Key: PHOENIX-3521
> URL: https://issues.apache.org/jira/browse/PHOENIX-3521
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0, 4.8.0
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Attachments: gen.py
>
>
> The following code can be used to reproduce the issue:
> {noformat}
> @Test
> public void testit() throws Exception {
>     Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
>     final PhoenixConnection phxConn = conn.unwrap(PhoenixConnection.class);
>     // Repeat the whole cycle indefinitely: the failure is intermittent.
>     while (true) {
>         String t2 = "DROP TABLE IF EXISTS GIGANTIC_TABLE";
>         conn.createStatement().execute(t2);
>         t2 = "CREATE TABLE IF NOT EXISTS GIGANTIC_TABLE (ID INTEGER PRIMARY KEY,"
>             + " unsig_id UNSIGNED_INT, big_id BIGINT, unsig_long_id UNSIGNED_LONG,"
>             + " tiny_id TINYINT, unsig_tiny_id UNSIGNED_TINYINT, small_id SMALLINT,"
>             + " unsig_small_id UNSIGNED_SMALLINT, float_id FLOAT,"
>             + " unsig_float_id UNSIGNED_FLOAT, double_id DOUBLE,"
>             + " unsig_double_id UNSIGNED_DOUBLE, decimal_id DECIMAL,"
>             + " boolean_id BOOLEAN, time_id TIME, date_id DATE, timestamp_id TIMESTAMP,"
>             + " unsig_time_id TIME, unsig_date_id DATE, unsig_timestamp_id TIMESTAMP,"
>             + " varchar_id VARCHAR (30), char_id CHAR (30), binary_id VARCHAR (100),"
>             + " varbinary_id VARCHAR (100), array_id VARCHAR[])";
>         conn.createStatement().execute(t2);
>         // Bulk load /tmp/1.csv: ',' separates fields, ';' separates array
>         // elements, and '"' is the quote character.
>         CsvBulkLoadTool csvBulkLoadTool = new CsvBulkLoadTool();
>         csvBulkLoadTool.setConf(new Configuration());
>         int exitCode = csvBulkLoadTool.run(new String[]{
>             "--input", "/tmp/1.csv",
>             "--table", "GIGANTIC_TABLE",
>             "-d", ",",
>             "-a", ";",
>             "-q", "\"\"\""
>         });
>         assertEquals(0, exitCode);
>         for (int j = 0; j < 5; j++) {
>             // Recreate the local index, then run the same count five times.
>             t2 = "DROP INDEX IF EXISTS LOCAL_INDEX_COLUMN_TYPE_char_id"
>                 + " on GIGANTIC_TABLE";
>             phxConn.createStatement().execute(t2);
>             t2 = "CREATE LOCAL INDEX LOCAL_INDEX_COLUMN_TYPE_char_id"
>                 + " ON GIGANTIC_TABLE (char_id)";
>             phxConn.createStatement().execute(t2);
>             String query = "SELECT count(*) FROM GIGANTIC_TABLE"
>                 + " WHERE char_id like '%a%'";
>             for (int i = 0; i < 5; i++) {
>                 ResultSet rs = phxConn.createStatement().executeQuery(query);
>                 while (rs.next()) {
>                     int result = rs.getInt(1);
>                     // Every row is expected to match, so the count must be 500000.
>                     if (result == 500000) {
>                         LOG.error("OK");
>                     } else {
>                         LOG.error("FAILURE!!!!");
>                     }
>                 }
>             }
>         }
>     }
> }
> {noformat}
> The issue is hard to reproduce and can take several hours to trigger. When
> it fails, the query returns a count higher than the expected 500,000.
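
Since the failure shows up as a count above 500,000, it may help to log how far off the count is instead of a bare "FAILURE!!!!". The following is only a minimal sketch: `CountCheck`, `surplus` and `describe` are hypothetical names that appear neither in the original test nor in Phoenix, and the expected value is taken from the check in the code above.

```java
// Hypothetical helper for the reproduction loop; not part of Phoenix.
final class CountCheck {
    /** Row count the reproduction query is expected to return. */
    static final long EXPECTED_ROWS = 500_000L;

    /** Number of rows over-reported by the scan (negative if under-reported). */
    static long surplus(long actual) {
        return actual - EXPECTED_ROWS;
    }

    /** Builds a log message that records the size of the mismatch, if any. */
    static String describe(long actual) {
        long diff = surplus(actual);
        if (diff == 0) {
            return "OK";
        }
        return "FAILURE: count=" + actual + ", expected=" + EXPECTED_ROWS
                + ", difference=" + diff;
    }

    public static void main(String[] args) {
        // Example inputs only; real values would come from rs.getLong(1).
        System.out.println(describe(500_000L));
        System.out.println(describe(500_123L));
    }
}
```

In the loop above, `LOG.error(CountCheck.describe(rs.getLong(1)))` could replace the if/else, so that each failure records how many extra rows the scan returned.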