[jira] [Updated] (PHOENIX-3521) Scan over local index may return incorrect result after flush & compaction

Sergey Soldatov (JIRA) Tue, 06 Dec 2016 15:26:15 -0800

     [ 
https://issues.apache.org/jira/browse/PHOENIX-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sergey Soldatov updated PHOENIX-3521:
-------------------------------------
    Attachment: gen.py

The Python script used to generate 1.csv.

> Scan over local index may return incorrect result after flush & compaction
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-3521
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3521
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0, 4.8.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>         Attachments: gen.py
>
>
> The following code can be used to reproduce the issue:
> {noformat}
> @Test
>   public void testit() throws Exception {
>     Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
>     final PhoenixConnection phxConn = conn.unwrap(PhoenixConnection.class);
>     while(true) {
>       String t2 = "DROP TABLE IF EXISTS GIGANTIC_TABLE";
>       conn.createStatement().execute(t2);
>       t2 = "CREATE TABLE IF NOT EXISTS GIGANTIC_TABLE (ID INTEGER PRIMARY KEY, "
>           + "unsig_id UNSIGNED_INT, big_id BIGINT, unsig_long_id UNSIGNED_LONG, "
>           + "tiny_id TINYINT, unsig_tiny_id UNSIGNED_TINYINT, small_id SMALLINT, "
>           + "unsig_small_id UNSIGNED_SMALLINT, float_id FLOAT, "
>           + "unsig_float_id UNSIGNED_FLOAT, double_id DOUBLE, "
>           + "unsig_double_id UNSIGNED_DOUBLE, decimal_id DECIMAL, "
>           + "boolean_id BOOLEAN, time_id TIME, date_id DATE, "
>           + "timestamp_id TIMESTAMP, unsig_time_id TIME, unsig_date_id DATE, "
>           + "unsig_timestamp_id TIMESTAMP, varchar_id VARCHAR (30), "
>           + "char_id CHAR (30), binary_id VARCHAR (100), "
>           + "varbinary_id VARCHAR (100), array_id VARCHAR[])";
>       conn.createStatement().execute(t2);
>       CsvBulkLoadTool csvBulkLoadTool = new CsvBulkLoadTool();
>       csvBulkLoadTool.setConf(new Configuration());
>       int exitCode = csvBulkLoadTool.run(new String[]{
>               "--input", "/tmp/1.csv",
>               "--table", "GIGANTIC_TABLE",
>               "-d", ",",
>               "-a", ";",
>               "-q", "\"\"\""
>       });
>       assertEquals(0, exitCode);
>       for (int j = 0; j < 5; j++) {
>         t2 = "DROP INDEX IF EXISTS LOCAL_INDEX_COLUMN_TYPE_char_id "
>             + "on GIGANTIC_TABLE";
>         phxConn.createStatement().execute(t2);
>         t2 = "CREATE LOCAL INDEX LOCAL_INDEX_COLUMN_TYPE_char_id "
>             + "ON GIGANTIC_TABLE (char_id)";
>         phxConn.createStatement().execute(t2);
>         String query = "SELECT count(*) FROM GIGANTIC_TABLE "
>             + "WHERE char_id like '%a%'";
>         for (int i = 0; i < 5; i++) {
>           ResultSet rs = phxConn.createStatement().executeQuery(query);
>           while (rs.next()) {
>             int result = rs.getInt(1);
>             if (result == 500000) {
>               LOG.error("OK");
>             } else {
>               LOG.error("FAILURE!!!!");
>             }
>           }
>         }
>       }
>     }
>   }
> {noformat}
> The issue is hard to reproduce: the loop sometimes has to run for several
> hours, or longer, before it appears. When it fails, the query returns a
> count higher than the expected 500,000.
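
Because the failure can take hours to appear and shows up only as an inflated count, it may help to stop at the first mismatch and record how far off the count was, rather than logging inside an endless loop. The following is a minimal sketch of that idea, not part of the original test or of Phoenix: the class name CountMismatchDetector and the method firstMismatch are hypothetical, and the counts in main are simulated stand-ins for the real JDBC query, not measured results.

```java
import java.util.OptionalLong;
import java.util.function.LongSupplier;

/** Hypothetical helper for the reproduction above; not a Phoenix API. */
final class CountMismatchDetector {
    /** Row count the reproduction expects (taken from the test above). */
    static final long EXPECTED_ROWS = 500_000L;

    /**
     * Runs countQuery up to maxAttempts times and returns the first observed
     * count that differs from expected, or an empty value if all attempts match.
     */
    public static OptionalLong firstMismatch(LongSupplier countQuery,
                                             long expected,
                                             int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long observed = countQuery.getAsLong();
            if (observed != expected) {
                // Report the attempt number and the signed difference.
                System.err.printf("attempt %d: expected %d rows, got %d (%+d)%n",
                        attempt, expected, observed, observed - expected);
                return OptionalLong.of(observed);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Simulated counts (assumption): two correct results, then an inflated one.
        // In the real test the supplier would run the SELECT count(*) query.
        long[] simulated = {500_000L, 500_000L, 500_137L};
        int[] next = {0};
        OptionalLong bad = firstMismatch(() -> simulated[next[0]++],
                EXPECTED_ROWS, simulated.length);
        System.out.println(bad.isPresent() ? "FAILURE: " + bad.getAsLong() : "OK");
    }
}
```

In the reproduction, the supplier passed to firstMismatch would wrap the executeQuery call and return rs.getLong(1), so the first wrong count ends the run with the observed value preserved.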



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
