Alter Table throws java.lang.NullPointerException

2019-07-23 Thread Alexander Batyrshin
Hello all,

Got this:

alter table TEST_TABLE SET APPEND_ONLY_SCHEMA=true;
java.lang.NullPointerException
    at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3240)
    at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3221)
    at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
    at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
    at sqlline.Commands.execute(Commands.java:822)
    at sqlline.Commands.sql(Commands.java:732)
    at sqlline.SqlLine.dispatch(SqlLine.java:813)
    at sqlline.SqlLine.begin(SqlLine.java:686)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:291)


Any ideas how to fix?

Inconsistent Results on Table Join when Index and TIMESTAMP type involved

2019-07-23 Thread Jack Steenkamp
Hi All,

I have a use case where I need to join a table with an index on itself in
order to find the latest entry. In doing so, I've come across a case in
Phoenix (4.14.1) where I appear to get incorrect results if I do a join on
TIMESTAMPs where indexes are involved.

I have a suspicion that this is down to the fact that TIMESTAMPs are
stored differently on index tables (at least, based on the result set
metadata, they appear there as BigDecimal) and that somehow the conversion
is not done properly before joining. However, I thought I would check with
those who have more expertise in this area before I continue.
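For reference, this much is plain JDK behavior rather than anything Phoenix-specific: java.sql.Timestamp carries its sub-second precision in a separate nanosecond field, so any on-disk encoding has to round-trip more than just the epoch millis. A minimal standalone sketch (class name is mine):

```java
import java.sql.Timestamp;

public class TimestampNanosDemo {
    public static void main(String[] args) {
        // java.sql.Timestamp keeps sub-second precision as a separate
        // nanosecond field alongside the date/time value.
        Timestamp ts = Timestamp.valueOf("1970-01-01 00:00:00.10001");
        System.out.println(ts.getNanos()); // 100010000 (0.10001 s in ns)
        System.out.println(ts);            // 1970-01-01 00:00:00.10001
    }
}
```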

Attached, you can find a standalone program that you can run which
reproduces this issue. Here are the basic steps:

1) Create a table which effectively has three representations of the same
value: as a TIMESTAMP, as a VARCHAR and as a CHAR(29). Create an index for
each of these representations:

CREATE TABLE IF NOT EXISTS TIME_TEST
(
   ID                VARCHAR NOT NULL,
   ENTRY_NAME        VARCHAR,
   TRANSACTION       TIMESTAMP,
   TRANSACTION_STR   VARCHAR,
   TRANSACTION_CHR   CHAR(29),
   CONSTRAINT pk PRIMARY KEY(ID)
)
IMMUTABLE_STORAGE_SCHEME=ONE_CELL_PER_COLUMN,
UPDATE_CACHE_FREQUENCY=90,
COLUMN_ENCODED_BYTES=NONE,
IMMUTABLE_ROWS=true

CREATE LOCAL INDEX TIME_TEST_I01 ON TIME_TEST(ENTRY_NAME, TRANSACTION DESC)

CREATE LOCAL INDEX TIME_TEST_I02 ON TIME_TEST(ENTRY_NAME, TRANSACTION_STR DESC)

CREATE LOCAL INDEX TIME_TEST_I03 ON TIME_TEST(ENTRY_NAME, TRANSACTION_CHR DESC)

2) Fill the table with some values in the different representations (the
attached test case inserts 100 entries), so that you have something like:

ID        ENTRY_NAME   TRANSACTION                 TRANSACTION_STR             TRANSACTION_CHR
----------------------------------------------------------------------------------------------------
001       001          1970-01-01 01:00:00.1       1970-01-01 00:00:00.1       1970-01-01 00:00:00.1
0010001   0010001      1970-01-01 01:00:00.10001   1970-01-01 00:00:00.10001   1970-01-01 00:00:00.10001
0010002   0010002      1970-01-01 01:00:00.10002   1970-01-01 00:00:00.10002   1970-01-01 00:00:00.10002
0010003   0010003      1970-01-01 01:00:00.10003   1970-01-01 00:00:00.10003   1970-01-01 00:00:00.10003

3) Run essentially the same MAX / join query using each of the different
representations (TIMESTAMP vs. VARCHAR vs. CHAR(29)). They should all
return the same number of rows (i.e. the total number of rows in the
table), but only the VARCHAR-based one does:

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
INNER JOIN
(
  SELECT ENTRY_NAME, MAX(TRANSACTION_STR) AS TRANSACTION_STR
  FROM TIME_TEST WHERE ENTRY_NAME LIKE '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME
  AND TIME_TEST.TRANSACTION_STR = sub.TRANSACTION_STR
WHERE TIME_TEST.ENTRY_NAME LIKE '0%'

*TOTAL Rows : 100*

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
INNER JOIN
(
  SELECT ENTRY_NAME, MAX(TRANSACTION) AS TRANSACTION
  FROM TIME_TEST WHERE ENTRY_NAME LIKE '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME
  AND TIME_TEST.TRANSACTION = sub.TRANSACTION
WHERE TIME_TEST.ENTRY_NAME LIKE '0%'

*TOTAL Rows : 1*

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
INNER JOIN
(
  SELECT ENTRY_NAME, MAX(TRANSACTION_CHR) AS TRANSACTION_CHR
  FROM TIME_TEST WHERE ENTRY_NAME LIKE '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME
  AND TIME_TEST.TRANSACTION_CHR = sub.TRANSACTION_CHR
WHERE TIME_TEST.ENTRY_NAME LIKE '0%'

*TOTAL Rows : 0*

Interestingly enough, in the final query, if I change the inner join
expression to 'CAST(MAX(TRANSACTION_CHR) AS VARCHAR) AS TRANSACTION_CHR',
it works as expected. This leads me to believe this is some sort of
column type conversion issue.

Either way, I would appreciate any pointers on the above.

Thanks,
package com.jsteenkamp.phoenix;

import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.Timestamp;
import java.text.DecimalFormat;
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.TimeZone;


public class PhoenixTimestampEncodingTest
{
	public static final String JDBC_URL = "jdbc:phoenix:localhost:63214";
	private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.n");
	
	public static final int ITEMS_TO_INSERT = 100;
	
	public static final DecimalFormat decf = new DecimalFormat("000"); 
	public static final long STARTING_NANO = 1L;
	static
	{
		TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
	}
	
	public static void main(String... args) throws Exception
	{
		final java.sql.Connection conn = DriverManager.getConnection(JDBC_URL, "sa", "");
		final Statement stmt = conn.createStatement();
		
		stmt.execute("DROP TABLE IF EXISTS TIME_TEST ");
		
		String