[jira] [Commented] (PHOENIX-4110) ParallelRunListener should monitor number of tables and not number of tests

2017-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137828#comment-16137828
 ] 

Hadoop QA commented on PHOENIX-4110:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12883238/PHOENIX-4110.patch
  against master branch at commit fd893ef47be0780ab8b7c5991426092fd504b322.
  ATTACHMENT ID: 12883238

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
56 warning messages.

{color:red}-1 release audit{color}.  The applied patch generated 3 release 
audit warnings (more than the master's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1287//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1287//artifact/patchprocess/patchReleaseAuditWarnings.txt
Javadoc warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1287//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1287//console

This message is automatically generated.

> ParallelRunListener should monitor number of tables and not number of tests
> ---
>
> Key: PHOENIX-4110
> URL: https://issues.apache.org/jira/browse/PHOENIX-4110
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4110.patch
>
>
> ParallelRunListener today monitors the number of tests that have been run to 
> determine when the mini cluster should be shut down. This helps prevent our 
> test JVM forks from running into OOM. A better heuristic would be to instead 
> check the number of tables that were created by tests. That way, when a 
> particular test class has created lots of tables, we can shut down the mini 
> cluster sooner.
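As a rough sketch of the idea (not the attached patch), a JUnit RunListener could key off a table counter instead of a test counter; getTableCount() and tearDownMiniCluster() below are hypothetical stand-ins for whatever bookkeeping the patch actually uses:

{code}
import org.apache.phoenix.query.BaseTest;
import org.junit.runner.Description;
import org.junit.runner.notification.RunListener;

public class TableCountRunListener extends RunListener {
    // Hypothetical threshold; the real value would be tuned to the fork's heap.
    private static final int TABLE_THRESHOLD = 500;

    @Override
    public void testFinished(Description description) throws Exception {
        // Hypothetical helpers: once the tests have created enough tables,
        // recycle the mini cluster so the fork doesn't OOM.
        if (BaseTest.getTableCount() > TABLE_THRESHOLD) {
            BaseTest.tearDownMiniCluster();
        }
    }
}
{code}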



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4115) Global indexes of replicated, immutable tables should be replicated

2017-08-22 Thread Geoffrey Jacoby (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137814#comment-16137814
 ] 

Geoffrey Jacoby commented on PHOENIX-4115:
--

Thinking about this some more... there's something else that makes it more 
complicated. Replication will stall if you have edits to a replicated table on 
one side while the table doesn't exist on the remote side. 

This means the ideal order would be:
1. Create index on DR
2. Create index on primary
3. Turn replication on for DR (if master-master replication is desired)
4. Turn replication on for primary
5. Populate the index
6. Activate the index

That seems quite difficult for Phoenix to orchestrate, yet it's also a lot to 
ask operators to get right (speaking as an operator who's gotten it wrong :-) ).
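For concreteness, a hedged sketch of that ordering as Phoenix DDL over JDBC (cluster URLs, table T, column V, and index IDX are illustrative; steps 3-5 are operational rather than SQL):

{code}
import java.sql.Connection;
import java.sql.DriverManager;

public class ReplicatedIndexSetup {
    public static void main(String[] args) throws Exception {
        try (Connection dr = DriverManager.getConnection("jdbc:phoenix:dr-zk");
             Connection primary = DriverManager.getConnection("jdbc:phoenix:primary-zk")) {
            // 1 + 2: create the index on both sides; ASYNC defers population
            dr.createStatement().execute("CREATE INDEX IDX ON T(V) ASYNC");
            primary.createStatement().execute("CREATE INDEX IDX ON T(V) ASYNC");
            // 3 + 4: turn on replication for the index table on each cluster
            //        (an HBase admin step, e.g. REPLICATION_SCOPE=1, not SQL)
            // 5: populate the index, e.g. with the IndexTool MR job
            // 6: activate the index once it is populated
            primary.createStatement().execute("ALTER INDEX IDX ON T USABLE");
        }
    }
}
{code}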

> Global indexes of replicated, immutable tables should be replicated
> ---
>
> Key: PHOENIX-4115
> URL: https://issues.apache.org/jira/browse/PHOENIX-4115
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Geoffrey Jacoby
> Fix For: 4.12.0
>
>
> Global indexes are stored in their own standalone tables, and for indexes on 
> immutable tables, they're populated purely by client-side logic and don't go 
> through the indexing coprocessors. 
> This means that if a global index is created on an immutable table that's 
> replicated to a different HBase cluster (say, for DR), the index edits won't 
> be replicated to the remote cluster, because the server-side indexing logic 
> won't fire when the base table edits are processed on either side. Indexes 
> aren't created with a replication scope, so HBase defaults to "don't 
> replicate".
> The easiest fix for this is to set REPLICATION_SCOPE=1 on the index table when 
> creating a global index on an immutable table that has REPLICATION_SCOPE=1. 
> Interesting questions for potential follow-up JIRAs:
> 1. Should Phoenix automatically update existing immutable indexes that are 
> suffering from this problem on upgrade, or should a release note simply 
> explain the necessary fix to operators?
> 2. Should Phoenix honor replication filters on an indexed column family or 
> column in the data table on the index side? (Since these can change over 
> time, that would get complicated very quickly.)
> Thanks, [~mujtabachohan], for pointing out and verifying this problem!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137795#comment-16137795
 ] 

Hudson commented on PHOENIX-4114:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1747 (See 
[https://builds.apache.org/job/Phoenix-master/1747/])
PHOENIX-4114 Derive ConcurrentMutationsIT from ParallelStatsDisabledIT 
(jamestaylor: rev fd893ef47be0780ab8b7c5991426092fd504b322)
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/ConcurrentMutationsIT.java
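Presumably the gist of the change (not the verbatim patch) is just the new base class:

{code}
// Reuses the shared mini cluster managed by ParallelStatsDisabledIT
// instead of spinning up a dedicated one.
public class ConcurrentMutationsIT extends ParallelStatsDisabledIT {
    // test methods unchanged
}
{code}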


> Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
> -
>
> Key: PHOENIX-4114
> URL: https://issues.apache.org/jira/browse/PHOENIX-4114
> Project: Phoenix
>  Issue Type: Test
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4114.patch
>
>
> No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4115) Global indexes of replicated, immutable tables should be replicated

2017-08-22 Thread Geoffrey Jacoby (JIRA)
Geoffrey Jacoby created PHOENIX-4115:


 Summary: Global indexes of replicated, immutable tables should be 
replicated
 Key: PHOENIX-4115
 URL: https://issues.apache.org/jira/browse/PHOENIX-4115
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.11.0
Reporter: Geoffrey Jacoby
 Fix For: 4.12.0


Global indexes are stored in their own standalone tables, and for indexes on 
immutable tables, they're populated purely by client-side logic and don't go 
through the indexing coprocessors. 

This means that if a global index is created on an immutable table that's 
replicated to a different HBase cluster (say, for DR), the index edits won't 
be replicated to the remote cluster, because the server-side indexing logic 
won't fire when the base table edits are processed on either side. Indexes 
aren't created with a replication scope, so HBase defaults to "don't 
replicate".

The easiest fix for this is to set REPLICATION_SCOPE=1 on the index table when 
creating a global index on an immutable table that has REPLICATION_SCOPE=1. 
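A hedged sketch of that fix against the HBase 1.x admin API (the index table name is illustrative, and the real patch may hook this into index creation differently):

{code}
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

public class IndexReplicationFix {
    // Set REPLICATION_SCOPE=1 on every column family of the index table.
    static void replicateIndexTable(Admin admin, String indexTable) throws Exception {
        HTableDescriptor desc = admin.getTableDescriptor(TableName.valueOf(indexTable));
        for (HColumnDescriptor cf : desc.getColumnFamilies()) {
            cf.setScope(HConstants.REPLICATION_SCOPE_GLOBAL); // 1 = replicate
        }
        admin.modifyTable(desc.getTableName(), desc);
    }
}
{code}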

Interesting questions for potential follow-up JIRAs:
1. Should Phoenix automatically update existing immutable indexes that are 
suffering from this problem on upgrade, or should a release note simply 
explain the necessary fix to operators?
2. Should Phoenix honor replication filters on an indexed column family or 
column in the data table on the index side? (Since these can change over time, 
that would get complicated very quickly.)

Thanks, [~mujtabachohan], for pointing out and verifying this problem!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4110) ParallelRunListener should monitor number of tables and not number of tests

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4110:
--
Attachment: PHOENIX-4110.patch

> ParallelRunListener should monitor number of tables and not number of tests
> ---
>
> Key: PHOENIX-4110
> URL: https://issues.apache.org/jira/browse/PHOENIX-4110
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4110.patch
>
>
> ParallelRunListener today monitors the number of tests that have been run to 
> determine when the mini cluster should be shut down. This helps prevent our 
> test JVM forks from running into OOM. A better heuristic would be to instead 
> check the number of tables that were created by tests. That way, when a 
> particular test class has created lots of tables, we can shut down the mini 
> cluster sooner.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137733#comment-16137733
 ] 

Samarth Jain commented on PHOENIX-4113:
---

[~jamestaylor] - I was hoping the QA run would help me validate whether this 
change is going to help. Unfortunately the QA bot doesn't seem to be able to 
apply the patch on the 0.98 branch, even though I generated it against the 
latest code. I am running the tests locally to verify whether this helps the 
0.98 build pass. I have seen that build consistently fail with the error below:

{code}
Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?

[ERROR] Process Exit Code: 0
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:679)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:533)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$600(ForkStarter.java:117)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:429)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:406)
[ERROR] at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[ERROR] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[ERROR] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[ERROR] at java.lang.Thread.run(Thread.java:745)
{code}



> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98_v2.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.
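(The distinction at play, as a tiny self-contained example: exit() runs registered shutdown hooks, halt() terminates immediately and skips them.)

{code}
public class HaltVsExit {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("hook ran")));
        Runtime.getRuntime().exit(0);     // prints "hook ran"
        // Runtime.getRuntime().halt(0); // would skip the hook entirely
    }
}
{code}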



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor resolved PHOENIX-4114.
---
Resolution: Fixed

> Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
> -
>
> Key: PHOENIX-4114
> URL: https://issues.apache.org/jira/browse/PHOENIX-4114
> Project: Phoenix
>  Issue Type: Test
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4114.patch
>
>
> No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4114:
--
Fix Version/s: 4.12.0

> Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
> -
>
> Key: PHOENIX-4114
> URL: https://issues.apache.org/jira/browse/PHOENIX-4114
> Project: Phoenix
>  Issue Type: Test
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4114.patch
>
>
> No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4114:
--
Issue Type: Test  (was: Bug)

> Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
> -
>
> Key: PHOENIX-4114
> URL: https://issues.apache.org/jira/browse/PHOENIX-4114
> Project: Phoenix
>  Issue Type: Test
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4114.patch
>
>
> No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137666#comment-16137666
 ] 

ASF GitHub Bot commented on PHOENIX-2460:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134624315
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyMapper.java
 ---
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import java.io.IOException;
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Properties;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.phoenix.mapreduce.PhoenixJobCounters;
+import org.apache.phoenix.mapreduce.index.IndexScrutinyTool.OutputFormat;
+import org.apache.phoenix.mapreduce.index.IndexScrutinyTool.SourceTable;
+import org.apache.phoenix.mapreduce.util.ConnectionUtil;
+import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
+import org.apache.phoenix.parse.HintNode.Hint;
+import org.apache.phoenix.schema.PTable;
+import org.apache.phoenix.util.ColumnInfo;
+import org.apache.phoenix.util.PhoenixRuntime;
+import org.apache.phoenix.util.QueryUtil;
+import org.apache.phoenix.util.SchemaUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Joiner;
+
+/**
+ * Mapper that reads from the data table and checks the rows against the index table
+ */
+public class IndexScrutinyMapper extends Mapper<NullWritable, PhoenixIndexDBWritable, Text, Text> {
+
+    private static final Logger LOG = LoggerFactory.getLogger(IndexScrutinyMapper.class);
+    private Connection connection;
+    private List<ColumnInfo> targetTblColumnMetadata;
+    private long batchSize;
+    // holds a batch of rows from the table the mapper is iterating over
+    private List<List<Object>> currentBatchValues = new ArrayList<>();
+    private String targetTableQuery;
+    private int numTargetPkCols;
+    private boolean outputInvalidRows;
+    private OutputFormat outputFormat = OutputFormat.FILE;
+    private String qSourceTable;
+    private String qTargetTable;
+    private long executeTimestamp;
+    private int numSourcePkCols;
+    private final PhoenixIndexDBWritable indxWritable = new PhoenixIndexDBWritable();
+    private List<ColumnInfo> sourceTblColumnMetadata;
+
+    // used to write results to the output table
+    private Connection outputConn;
+    private PreparedStatement outputUpsertStmt;
+    private long outputMaxRows;
+
+    @Override
+    protected void setup(final Context context) throws IOException, InterruptedException {
+        super.setup(context);
+        final Configuration configuration = context.getConfiguration();
+        try {
+            // get a connection with correct CURRENT_SCN (so incoming writes don't throw off the
+            // scrutiny)
+            final Properties overrideProps = new Properties();
+            String scn = configuration.get(PhoenixConfigurationUtil.CURRENT_SCN_VALUE);
+            overrideProps.put(PhoenixRuntime.CURRENT_SCN_ATTRIB, scn);
+            connection = ConnectionUtil.getOutputConnection(configuration, overrideProps);
+            connection.setAutoCommit(false);
+            batchSize = PhoenixConfigurationUtil.getScrutinyBatchSize(configuration);
+            outputInvalidRows =
+

[jira] [Commented] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137664#comment-16137664
 ] 

ASF GitHub Bot commented on PHOENIX-2460:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134631925
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/index/TestIndexScrutinyTableOutput.java
 ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for 
additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable
+ * law or agreed to in writing, software distributed under the License is 
distributed on an "AS IS"
+ * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied. See the License
+ * for the specific language governing permissions and limitations under 
the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+import java.util.Arrays;
+
+import org.apache.phoenix.mapreduce.util.IndexColumnNames;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestIndexScrutinyTableOutput extends BaseIndexTest {
+
+private static final long SCRUTINY_TIME_MILLIS = 1502908914193L;
+
+    @Before
+    public void setup() throws Exception {
+        super.setup();
+        conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_TABLE_DDL);
+        conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_METADATA_DDL);
+        conn.commit();
--- End diff --

Minor nit: commit() not necessary after DDL
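That is, the setup could presumably drop to:

{code}
conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_TABLE_DDL);
conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_METADATA_DDL);
// no conn.commit() needed: DDL in Phoenix takes effect immediately
{code}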


> Implement scrutiny command to validate whether or not an index is in sync 
> with the data table
> -
>
> Key: PHOENIX-2460
> URL: https://issues.apache.org/jira/browse/PHOENIX-2460
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-2460.patch
>
>
> We should have a process that runs to verify that an index is valid against a 
> data table and potentially fixes it if discrepancies are found. This could be 
> either an MR job or a low-priority background task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137667#comment-16137667
 ] 

ASF GitHub Bot commented on PHOENIX-2460:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134632114
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/util/TestIndexColumnNames.java
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.mapreduce.util;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+
+import org.apache.phoenix.jdbc.PhoenixConnection;
+import org.apache.phoenix.mapreduce.index.BaseIndexTest;
+import org.apache.phoenix.parse.HintNode.Hint;
+import org.apache.phoenix.schema.PTableKey;
+import org.apache.phoenix.util.QueryUtil;
+import org.junit.Test;
+
+public class TestIndexColumnNames extends BaseIndexTest {
--- End diff --

Same here for name: IndexColumnNamesTest


> Implement scrutiny command to validate whether or not an index is in sync 
> with the data table
> -
>
> Key: PHOENIX-2460
> URL: https://issues.apache.org/jira/browse/PHOENIX-2460
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-2460.patch
>
>
> We should have a process that runs to verify that an index is valid against a 
> data table and potentially fixes it if discrepancies are found. This could be 
> either an MR job or a low-priority background task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137665#comment-16137665
 ] 

ASF GitHub Bot commented on PHOENIX-2460:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134632028
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/index/TestIndexScrutinyTableOutput.java
 ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for 
additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable
+ * law or agreed to in writing, software distributed under the License is 
distributed on an "AS IS"
+ * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied. See the License
+ * for the specific language governing permissions and limitations under 
the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+import java.util.Arrays;
+
+import org.apache.phoenix.mapreduce.util.IndexColumnNames;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestIndexScrutinyTableOutput extends BaseIndexTest {
--- End diff --

Minor nit: Phoenix naming convention (with a few exceptions) is to post fix 
class with "Test": IndexScrutinyTableOutputTest


> Implement scrutiny command to validate whether or not an index is in sync 
> with the data table
> -
>
> Key: PHOENIX-2460
> URL: https://issues.apache.org/jira/browse/PHOENIX-2460
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-2460.patch
>
>
> We should have a process that runs to verify that an index is valid against a 
> data table and potentially fixes it if discrepancies are found. This could be 
> either an MR job or a low-priority background task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] phoenix pull request #269: PHOENIX-2460 Implement scrutiny command to valida...

2017-08-22 Thread JamesRTaylor
Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134632114
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/util/TestIndexColumnNames.java
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.mapreduce.util;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+
+import org.apache.phoenix.jdbc.PhoenixConnection;
+import org.apache.phoenix.mapreduce.index.BaseIndexTest;
+import org.apache.phoenix.parse.HintNode.Hint;
+import org.apache.phoenix.schema.PTableKey;
+import org.apache.phoenix.util.QueryUtil;
+import org.junit.Test;
+
+public class TestIndexColumnNames extends BaseIndexTest {
--- End diff --

Same here for name: IndexColumnNamesTest


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] phoenix pull request #269: PHOENIX-2460 Implement scrutiny command to valida...

2017-08-22 Thread JamesRTaylor
Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134624315
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyMapper.java
 ---
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import java.io.IOException;
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Properties;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.phoenix.mapreduce.PhoenixJobCounters;
+import org.apache.phoenix.mapreduce.index.IndexScrutinyTool.OutputFormat;
+import org.apache.phoenix.mapreduce.index.IndexScrutinyTool.SourceTable;
+import org.apache.phoenix.mapreduce.util.ConnectionUtil;
+import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
+import org.apache.phoenix.parse.HintNode.Hint;
+import org.apache.phoenix.schema.PTable;
+import org.apache.phoenix.util.ColumnInfo;
+import org.apache.phoenix.util.PhoenixRuntime;
+import org.apache.phoenix.util.QueryUtil;
+import org.apache.phoenix.util.SchemaUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Joiner;
+
+/**
+ * Mapper that reads from the data table and checks the rows against the index table
+ */
+public class IndexScrutinyMapper extends Mapper<NullWritable, PhoenixIndexDBWritable, Text, Text> {
+
+    private static final Logger LOG = LoggerFactory.getLogger(IndexScrutinyMapper.class);
+    private Connection connection;
+    private List<ColumnInfo> targetTblColumnMetadata;
+    private long batchSize;
+    // holds a batch of rows from the table the mapper is iterating over
+    private List<List<Object>> currentBatchValues = new ArrayList<>();
+    private String targetTableQuery;
+    private int numTargetPkCols;
+    private boolean outputInvalidRows;
+    private OutputFormat outputFormat = OutputFormat.FILE;
+    private String qSourceTable;
+    private String qTargetTable;
+    private long executeTimestamp;
+    private int numSourcePkCols;
+    private final PhoenixIndexDBWritable indxWritable = new PhoenixIndexDBWritable();
+    private List<ColumnInfo> sourceTblColumnMetadata;
+
+    // used to write results to the output table
+    private Connection outputConn;
+    private PreparedStatement outputUpsertStmt;
+    private long outputMaxRows;
+
+    @Override
+    protected void setup(final Context context) throws IOException, InterruptedException {
+        super.setup(context);
+        final Configuration configuration = context.getConfiguration();
+        try {
+            // get a connection with correct CURRENT_SCN (so incoming writes don't throw off the
+            // scrutiny)
+            final Properties overrideProps = new Properties();
+            String scn = configuration.get(PhoenixConfigurationUtil.CURRENT_SCN_VALUE);
+            overrideProps.put(PhoenixRuntime.CURRENT_SCN_ATTRIB, scn);
+            connection = ConnectionUtil.getOutputConnection(configuration, overrideProps);
+            connection.setAutoCommit(false);
+            batchSize = PhoenixConfigurationUtil.getScrutinyBatchSize(configuration);
+            outputInvalidRows = PhoenixConfigurationUtil.getScrutinyOutputInvalidRows(configuration);
+            outputFormat = PhoenixConfigurationUtil.getScrutinyOutputFormat(configuration);
+            executeTimestamp =

[GitHub] phoenix pull request #269: PHOENIX-2460 Implement scrutiny command to valida...

2017-08-22 Thread JamesRTaylor
Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134632028
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/index/TestIndexScrutinyTableOutput.java
 ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for 
additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable
+ * law or agreed to in writing, software distributed under the License is 
distributed on an "AS IS"
+ * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied. See the License
+ * for the specific language governing permissions and limitations under 
the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+import java.util.Arrays;
+
+import org.apache.phoenix.mapreduce.util.IndexColumnNames;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestIndexScrutinyTableOutput extends BaseIndexTest {
--- End diff --

Minor nit: Phoenix naming convention (with a few exceptions) is to post fix 
class with "Test": IndexScrutinyTableOutputTest


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] phoenix pull request #269: PHOENIX-2460 Implement scrutiny command to valida...

2017-08-22 Thread JamesRTaylor
Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/269#discussion_r134631925
  
--- Diff: 
phoenix-core/src/test/java/org/apache/phoenix/mapreduce/index/TestIndexScrutinyTableOutput.java
 ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for 
additional information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at http://www.apache.org/licenses/LICENSE-2.0 
Unless required by applicable
+ * law or agreed to in writing, software distributed under the License is 
distributed on an "AS IS"
+ * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied. See the License
+ * for the specific language governing permissions and limitations under 
the License.
+ */
+package org.apache.phoenix.mapreduce.index;
+
+import static org.junit.Assert.assertEquals;
+
+import java.sql.SQLException;
+import java.util.Arrays;
+
+import org.apache.phoenix.mapreduce.util.IndexColumnNames;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestIndexScrutinyTableOutput extends BaseIndexTest {
+
+private static final long SCRUTINY_TIME_MILLIS = 1502908914193L;
+
+    @Before
+    public void setup() throws Exception {
+        super.setup();
+        conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_TABLE_DDL);
+        conn.createStatement().execute(IndexScrutinyTableOutput.OUTPUT_METADATA_DDL);
+        conn.commit();
--- End diff --

Minor nit: commit() not necessary after DDL


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2017-08-22 Thread Rahul Shrivastava (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137581#comment-16137581
 ] 

Rahul Shrivastava commented on PHOENIX-3655:


[~elserj] [~jamestaylor] [~samarthjain]

Hi All,

I guess we can start the discussion on how we want metrics collection to be 
designed for the Phoenix Query Server. 

I'll lay out the options, and you can provide your input (or even add another 
option):

1. Write a layer to convert from the Phoenix internal representation to the 
metrics system of choice (a shim) -- as suggested by [~elserj].
2. Write the request-level metrics at event close (statement close, connection 
close) and push the data down to Phoenix tables themselves; see the sketch 
below. We would create bootstrap tables in Phoenix and write every 
request/global-level metric to them. That way, we would have the option to 
collect the metrics later by querying the tables. We could have a Phoenix-level 
parameter that limits how long metrics are retained.
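A hedged sketch of option 2 (the table name, columns, and metric are hypothetical, not an agreed schema):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PqsMetricsSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
            // Hypothetical bootstrap table for request-level PQS metrics.
            conn.createStatement().execute(
                "CREATE TABLE IF NOT EXISTS PQS.METRICS ("
                + " CONNECTION_ID VARCHAR NOT NULL,"
                + " STATEMENT_ID BIGINT NOT NULL,"
                + " METRIC_NAME VARCHAR NOT NULL,"
                + " METRIC_VALUE BIGINT,"
                + " CONSTRAINT PK PRIMARY KEY (CONNECTION_ID, STATEMENT_ID, METRIC_NAME))");
            // At statement/connection close, PQS would upsert each collected metric.
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO PQS.METRICS VALUES (?, ?, ?, ?)")) {
                ps.setString(1, "conn-1");
                ps.setLong(2, 42L);
                ps.setString(3, "SCAN_BYTES");
                ps.setLong(4, 123456L);
                ps.executeUpdate();
            }
            conn.commit();
        }
    }
}
{code}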

Please advise.

thanks
Rahul


> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Rahul Shrivastava
> Fix For: 4.12.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every JDBC statement/prepared statement run by PQS, we need the 
> capability to collect metrics at the PQS level and push the data to an 
> external sink, i.e. a file, JMX, or other external custom sources.
> 2. Besides this, global metrics could be periodically collected and pushed to 
> the sink.
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml.
> All metrics definitions: https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4114:
--
Attachment: PHOENIX-4114.patch

> Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
> -
>
> Key: PHOENIX-4114
> URL: https://issues.apache.org/jira/browse/PHOENIX-4114
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Attachments: PHOENIX-4114.patch
>
>
> No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4114) Derive ConcurrentMutationsIT from ParallelStatsDisabledIT

2017-08-22 Thread James Taylor (JIRA)
James Taylor created PHOENIX-4114:
-

 Summary: Derive ConcurrentMutationsIT from ParallelStatsDisabledIT
 Key: PHOENIX-4114
 URL: https://issues.apache.org/jira/browse/PHOENIX-4114
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor
Assignee: James Taylor


No need to spin up a new mini cluster for that test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137433#comment-16137433
 ] 

Hadoop QA commented on PHOENIX-4113:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12883190/PHOENIX-4113_4.x-HBase-0.98_v2.patch
  against 4.x-HBase-0.98 branch at commit 
16e0511ff3e65d2463ab4481b9dd9a42cdf18461.
  ATTACHMENT ID: 12883190

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1285//console

This message is automatically generated.

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98_v2.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137429#comment-16137429
 ] 

James Taylor commented on PHOENIX-4113:
---

The problem I've seen in the past with letting the shutdown hooks run is that 
they hang. 

bq. If the OS is not able to clean up these orphaned resources soon enough, 
subsequent new JVM processes could run into resource issues.
When a process dies, the OS is pretty good about cleaning up resources.

Do you see different/better results when you make this change? If not, I don't 
think we should make it.

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98_v2.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137426#comment-16137426
 ] 

Hadoop QA commented on PHOENIX-4113:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12883190/PHOENIX-4113_4.x-HBase-0.98_v2.patch
  against 4.x-HBase-0.98 branch at commit 
16e0511ff3e65d2463ab4481b9dd9a42cdf18461.
  ATTACHMENT ID: 12883190

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1284//console

This message is automatically generated.

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98_v2.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4113:
--
Attachment: PHOENIX-4113_4.x-HBase-0.98_v2.patch

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98_v2.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4113:
--
Attachment: (was: PHOENIX-4113_4.x-HBase-0.98.patch)

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137416#comment-16137416
 ] 

Hadoop QA commented on PHOENIX-4113:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12883189/PHOENIX-4113_4.x-HBase-0.98.patch
  against 4.x-HBase-0.98 branch at commit 
16e0511ff3e65d2463ab4481b9dd9a42cdf18461.
  ATTACHMENT ID: 12883189

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1282//console

This message is automatically generated.

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4113:
--
Attachment: PHOENIX-4113_4.x-HBase-0.98.patch

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4113:
--
Attachment: (was: PHOENIX-4113_4.x-HBase-0.98.patch)

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network ports, 
> file handles, etc.) it was using. If the OS is not able to clean up these 
> orphaned resources soon enough, subsequent new JVM processes could run into 
> resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-4113:
--
Attachment: PHOENIX-4113_4.x-HBase-0.98.patch

> Killing forked JVM may cause resources to be not released
> -
>
> Key: PHOENIX-4113
> URL: https://issues.apache.org/jira/browse/PHOENIX-4113
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-4113_4.x-HBase-0.98.patch
>
>
> We have a kill configured in the pom which, behind the scenes, calls
> {code}
> java.lang.Runtime.halt(1)
> {code}
> We also have a shutdown hook that calls halt on the JVM.
> {code}
> private static String checkClusterInitialized(ReadOnlyProps serverProps) 
> throws Exception {
> if (!clusterInitialized) {
> url = setUpTestCluster(config, serverProps);
> clusterInitialized = true;
> Runtime.getRuntime().addShutdownHook(new Thread() {
> @Override
> public void run() {
> logger.info("SHUTDOWN: halting JVM now");
> Runtime.getRuntime().halt(0);
> }
> });
> }
> return url;
> }
> {code}
> This causes the JVM to not execute all shutdown hooks, which in turn would 
> cause the JVM process to not release all the system resources (network 
> ports, file handles, etc.) it was using. If the OS is not able to clean up 
> these orphaned resources soon enough, it could cause subsequent new JVM 
> processes to run into resource issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (PHOENIX-418) Support approximate COUNT DISTINCT

2017-08-22 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137402#comment-16137402
 ] 

James Taylor edited comment on PHOENIX-418 at 8/22/17 8:59 PM:
---

Thanks for the revised patch, [~aertoria]. Looks very good. A couple of minor 
things:
- derive your test from ParallelStatsDisabledIT instead of 
BaseUniqueNamesOwnClusterIT and remove the setup method, which you won't need. 
The advantage of ParallelStatsDisabledIT tests is that they don't each need to 
spin up a new mini cluster when they run, so the overall test run time stays 
lower.
{code}
+public class CountDistinctApproximateHyperLogLogIT extends 
BaseUniqueNamesOwnClusterIT {
+@BeforeClass
+public static void doSetup() throws Exception {
+Map props = Maps.newHashMapWithExpectedSize(3);
+setUpTestDriver(new ReadOnlyProps(props.entrySet().iterator()));
+}
+
{code}
- I think it also makes sense to have another test derived from 
ParallelStatsEnabledIT. This base test class is configured to collect 
statistics. In this way, you can get more test coverage. You can likely run the 
exact same tests, but in this case you'll have guideposts in place (because 
stats will be collected). Make sure to call TestUtil.analyzeTable(connection, 
fullTableName) prior to running your TABLESAMPLE queries. You'll get more rows 
back, since you'll have guideposts in addition to region boundaries.
- minor nit, extra semicolon here:
{code}
+
DistinctCountHyperLogLogAggregateFunction(DistinctCountHyperLogLogAggregateFunction.class);;
{code}
- Instead of always copying the underlying byte buffer, use 
ByteUtil.copyKeyBytesIfNecessary(ImmutableBytesWritable ptr), which only 
copies when necessary (see the sketch after the excerpt below):
{code}
+   @Override
+   public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {  
+   try {
+   valueByteArray.set(hll.getBytes(), 0, 
hll.getBytes().length);
+   ptr.set(valueByteArray.copyBytes());
{code}
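A sketch of what that change could look like; the method body mirrors the 
excerpt above, and everything else about the surrounding class is assumed:
{code}
// Sketch of the suggested change; names mirror the excerpt above.
// copyKeyBytesIfNecessary only allocates a new byte[] when the writable
// does not already cover its whole backing array, so no copy happens here.
@Override
public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
    byte[] hllBytes = hll.getBytes();
    valueByteArray.set(hllBytes, 0, hllBytes.length);
    ptr.set(ByteUtil.copyKeyBytesIfNecessary(valueByteArray));
    return true;
}
{code}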


was (Author: jamestaylor):
Thanks for the revised patch, [~aertoria]. Looks very good. A couple of minor 
things:
- derive your test from ParallelStatsDisabledIT instead of 
BaseUniqueNamesOwnClusterIT and remove the setup method, which you won't need. 
The advantage of ParallelStatsDisabledIT tests is that they don't each need to 
spin up a new mini cluster when they run, so the overall test run time stays 
lower.
{code}
+public class CountDistinctApproximateHyperLogLogIT extends 
BaseUniqueNamesOwnClusterIT {
+@BeforeClass
+public static void doSetup() throws Exception {
+Map props = Maps.newHashMapWithExpectedSize(3);
+setUpTestDriver(new ReadOnlyProps(props.entrySet().iterator()));
+}
+
{code}
- I think it also makes sense to have another test derived from 
ParallelStatsEnabledIT. This base test class is configured to collect 
statistics. In this way, you can get more test coverage. You can likely run the 
exact same tests, but in this case you'll have guideposts in place (because 
stats will be collected). Make sure to call TestUtil.analyzeTable(connection, 
fullTableName) prior to running your TABLESAMPLE queries. You'll get more rows 
back, since you'll have guideposts in addition to region boundaries.
- minor nit, extra semicolon here:
{code}
+
DistinctCountHyperLogLogAggregateFunction(DistinctCountHyperLogLogAggregateFunction.class);;
{code}
- Instead of always copying the underlying byte buffer, use 
ByteUtil.copyKeyBytesIfNecessary(ImmutableBytesWritable ptr), which only 
copies when necessary:
{code}
+   @Override
+   public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {  
+   try {
+   valueByteArray.set(hll.getBytes(), 0, 
hll.getBytes().length);
+   ptr.set(valueByteArray.copyBytes());
{code}

> Support approximate COUNT DISTINCT
> --
>
> Key: PHOENIX-418
> URL: https://issues.apache.org/jira/browse/PHOENIX-418
> Project: Phoenix
>  Issue Type: Task
>Reporter: James Taylor
>Assignee: Ethan Wang
>  Labels: gsoc2016
> Attachments: PHOENIX-418-v1.patch, PHOENIX-418-v2.patch, 
> PHOENIX-418-v3.patch, PHOENIX-418-v4.patch
>
>
> Support an "approximation" of count distinct to prevent having to hold on to 
> all distinct values (since this will not scale well when the number of 
> distinct values is huge). The Apache Drill folks have had some interesting 
> discussions on this 
> [here](http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/201306.mbox/%3CJIRA.12650169.1369931282407.88049.1370645900553%40arcas%3E).
>  They recommend using  [Welford's 
> method](http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance_Online_algorithm).
>  I'm open to having a config option that uses exact versus approximate. I 
> don't have experience implementing an approximate implementation, so I'm not 
> sure how much state is required to keep on the server and return to the 
> client (other than realizing it'd be much less than returning all distinct 
> values and their counts).

[jira] [Commented] (PHOENIX-418) Support approximate COUNT DISTINCT

2017-08-22 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137402#comment-16137402
 ] 

James Taylor commented on PHOENIX-418:
--

Thanks for the revised patch, [~aertoria]. Looks very good. A couple of minor 
things:
- derive your test from ParallelStatsDisabledIT instead of 
BaseUniqueNamesOwnClusterIT and remove the setup method, which you won't need. 
The advantage of ParallelStatsDisabledIT tests is that they don't each need to 
spin up a new mini cluster when they run, so the overall test run time stays 
lower.
{code}
+public class CountDistinctApproximateHyperLogLogIT extends 
BaseUniqueNamesOwnClusterIT {
+@BeforeClass
+public static void doSetup() throws Exception {
+Map props = Maps.newHashMapWithExpectedSize(3);
+setUpTestDriver(new ReadOnlyProps(props.entrySet().iterator()));
+}
+
{code}
- I think it also makes sense to have another test derived from 
ParallelStatsEnabledIT. This base test class is configured to collect 
statistics. In this way, you can get more test coverage. You can likely run the 
exact same tests, but in this case you'll have guideposts in place (because 
stats will be collected). Make sure to call TestUtil.analyzeTable(connection, 
fullTableName) prior to running your TABLESAMPLE queries. You'll get more rows 
back, since you'll have guideposts in addition to region boundaries.
- minor nit, extra semicolon here:
{code}
+
DistinctCountHyperLogLogAggregateFunction(DistinctCountHyperLogLogAggregateFunction.class);;
{code}
- Instead of always copying the underlying byte buffer, use 
ByteUtil.copyKeyBytesIfNecessary(ImmutableBytesWritable ptr), which only 
copies when necessary:
{code}
+   @Override
+   public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {  
+   try {
+   valueByteArray.set(hll.getBytes(), 0, 
hll.getBytes().length);
+   ptr.set(valueByteArray.copyBytes());
{code}

> Support approximate COUNT DISTINCT
> --
>
> Key: PHOENIX-418
> URL: https://issues.apache.org/jira/browse/PHOENIX-418
> Project: Phoenix
>  Issue Type: Task
>Reporter: James Taylor
>Assignee: Ethan Wang
>  Labels: gsoc2016
> Attachments: PHOENIX-418-v1.patch, PHOENIX-418-v2.patch, 
> PHOENIX-418-v3.patch, PHOENIX-418-v4.patch
>
>
> Support an "approximation" of count distinct to prevent having to hold on to 
> all distinct values (since this will not scale well when the number of 
> distinct values is huge). The Apache Drill folks have had some interesting 
> discussions on this 
> [here](http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/201306.mbox/%3CJIRA.12650169.1369931282407.88049.1370645900553%40arcas%3E).
>  They recommend using  [Welford's 
> method](http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance_Online_algorithm).
>  I'm open to having a config option that uses exact versus approximate. I 
> don't have experience implementing an approximate implementation, so I'm not 
> sure how much state is required to keep on the server and return to the 
> client (other than realizing it'd be much less than returning all distinct 
> values and their counts).
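For context, a hypothetical sketch of the SQL surface this could enable; the 
function name APPROX_COUNT_DISTINCT and the table are assumptions for 
illustration, not anything settled by this issue:
{code}
-- Hypothetical surface syntax (function name is an assumption): the
-- approximate form keeps a small fixed-size sketch per group...
SELECT APPROX_COUNT_DISTINCT(ITEM_ID) FROM ITEMS;
-- ...while the exact form must track every distinct value it sees.
SELECT COUNT(DISTINCT ITEM_ID) FROM ITEMS;
{code}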



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4113) Killing forked JVM may cause resources to be not released

2017-08-22 Thread Samarth Jain (JIRA)
Samarth Jain created PHOENIX-4113:
-

 Summary: Killing forked JVM may cause resources to be not released
 Key: PHOENIX-4113
 URL: https://issues.apache.org/jira/browse/PHOENIX-4113
 Project: Phoenix
  Issue Type: Bug
Reporter: Samarth Jain
Assignee: Samarth Jain


We have a kill configured in the pom which, behind the scenes, 
calls 
{code}
java.lang.Runtime.halt(1)
{code}

We also have a shutdown hook which calls halt on the JVM.
{code}
private static String checkClusterInitialized(ReadOnlyProps serverProps) throws 
Exception {
if (!clusterInitialized) {
url = setUpTestCluster(config, serverProps);
clusterInitialized = true;
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
logger.info("SHUTDOWN: halting JVM now");
Runtime.getRuntime().halt(0);
}
});
}
return url;
}
{code}

This causes the JVM to not execute all shutdown hooks, which in turn would 
cause the JVM process to not release all the system resources (network ports, 
file handles, etc.) it was using. If the OS is not able to clean up these 
orphaned resources soon enough, it could cause subsequent new JVM processes to 
run into resource issues.
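One possible direction, sketched here under assumptions rather than taken from 
the attached patch: let the hook do orderly cleanup instead of halting, so 
every registered hook runs and the OS resources get released. The utility 
handle to the mini cluster is hypothetical.
{code}
// Sketch only, not the attached patch: do orderly cleanup in the hook and
// avoid halt(), so the remaining shutdown hooks still get a chance to run.
Runtime.getRuntime().addShutdownHook(new Thread() {
    @Override
    public void run() {
        logger.info("SHUTDOWN: stopping mini cluster");
        try {
            utility.shutdownMiniCluster(); // hypothetical HBaseTestingUtility handle
        } catch (Exception e) {
            logger.error("Error shutting down mini cluster", e);
        }
        // No Runtime.halt() here: network ports and file handles can be
        // released by the other hooks before the JVM exits.
    }
});
{code}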



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-3999) Optimize inner joins as SKIP-SCAN-JOIN when possible

2017-08-22 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137065#comment-16137065
 ] 

James Taylor commented on PHOENIX-3999:
---

Just to confirm, [~aertoria], this query:
{code}
SELECT 
i.ITEM_TYPE, b.BATCH_SEQUENCE_NUM, i.ITEM_ID, i.ITEM_VALUE   
FROM  
ITEMS i, COMPLETED_BATCHES b
WHERE 
   b.BATCH_ID = i.BATCH_ID
   AND b.BATCH_SEQUENCE_NUM > 0 
   AND b.BATCH_SEQUENCE_NUM < 2;
{code}
does not use a skip scan from COMPLETED_BATCHES into ITEMS?

Yes, it'll be way more efficient to do a skip scan here, as is done in the case 
of the semi join.

FYI, [~maryannxue].
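For contrast, a sketch of the semi-join formulation that does drive the skip 
scan today, using the schemas from the description below. Note that the semi 
join cannot project columns of COMPLETED_BATCHES, which is part of why 
optimizing the inner join matters:
{code}
-- Semi-join sketch: the IN subquery on the leading PK column of ITEMS
-- drives a skip scan, but columns of COMPLETED_BATCHES are not selectable.
SELECT i.ITEM_TYPE, i.ITEM_ID, i.ITEM_VALUE
FROM ITEMS i
WHERE i.BATCH_ID IN (
    SELECT BATCH_ID FROM COMPLETED_BATCHES
    WHERE BATCH_SEQUENCE_NUM > 0 AND BATCH_SEQUENCE_NUM < 2
);
{code}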

> Optimize inner joins as SKIP-SCAN-JOIN when possible
> 
>
> Key: PHOENIX-3999
> URL: https://issues.apache.org/jira/browse/PHOENIX-3999
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Semi joins on the leading part of the primary key end up doing batches of 
> point queries (as opposed to a broadcast hash join); however, inner joins do 
> not.
> Here's a set of example schemas that executes a skip scan on the inner query:
> {code}
> CREATE TABLE COMPLETED_BATCHES (
> BATCH_SEQUENCE_NUM BIGINT NOT NULL,
> BATCH_ID   BIGINT NOT NULL,
> CONSTRAINT PK PRIMARY KEY
> (
> BATCH_SEQUENCE_NUM,
> BATCH_ID
> )
> );
> CREATE TABLE ITEMS (
>BATCH_ID BIGINT NOT NULL,
>ITEM_ID BIGINT NOT NULL,
>ITEM_TYPE BIGINT,
>ITEM_VALUE VARCHAR,
>CONSTRAINT PK PRIMARY KEY
>(
> BATCH_ID,
> ITEM_ID
>)
> );
> CREATE TABLE COMPLETED_ITEMS (
>ITEM_TYPE  BIGINT NOT NULL,
>BATCH_SEQUENCE_NUM BIGINT NOT NULL,
>ITEM_IDBIGINT NOT NULL,
>ITEM_VALUE VARCHAR,
>CONSTRAINT PK PRIMARY KEY
>(
>   ITEM_TYPE,
>   BATCH_SEQUENCE_NUM,  
>   ITEM_ID
>)
> );
> {code}
> The explain plan of these indicates that a dynamic filter will be performed 
> like this:
> {code}
> UPSERT SELECT
> CLIENT PARALLEL 1-WAY FULL SCAN OVER ITEMS
> SKIP-SCAN-JOIN TABLE 0
> CLIENT PARALLEL 1-WAY RANGE SCAN OVER COMPLETED_BATCHES [1] - [2]
> SERVER FILTER BY FIRST KEY ONLY
> SERVER AGGREGATE INTO DISTINCT ROWS BY [BATCH_ID]
> CLIENT MERGE SORT
> DYNAMIC SERVER FILTER BY I.BATCH_ID IN ($8.$9)
> {code}
> We should also be able to leverage this optimization when an inner join is 
> used such as this:
> {code}
> UPSERT INTO COMPLETED_ITEMS (ITEM_TYPE, BATCH_SEQUENCE_NUM, ITEM_ID, 
> ITEM_VALUE)
>SELECT i.ITEM_TYPE, b.BATCH_SEQUENCE_NUM, i.ITEM_ID, i.ITEM_VALUE   
>FROM  ITEMS i, COMPLETED_BATCHES b
>WHERE b.BATCH_ID = i.BATCH_ID AND  
>b.BATCH_SEQUENCE_NUM > 1000 AND b.BATCH_SEQUENCE_NUM < 2000;
> {code}
> A complete unit test looks like this:
> {code}
> @Test
> public void testNestedLoopJoin() throws Exception {
> try (Connection conn = DriverManager.getConnection(getUrl())) {
> String t1="COMPLETED_BATCHES";
> String ddl1 = "CREATE TABLE " + t1 + " (\n" + 
> "BATCH_SEQUENCE_NUM BIGINT NOT NULL,\n" + 
> "BATCH_ID   BIGINT NOT NULL,\n" + 
> "CONSTRAINT PK PRIMARY KEY\n" + 
> "(\n" + 
> "BATCH_SEQUENCE_NUM,\n" + 
> "BATCH_ID\n" + 
> ")\n" + 
> ")" + 
> "";
> conn.createStatement().execute(ddl1);
> 
> String t2="ITEMS";
> String ddl2 = "CREATE TABLE " + t2 + " (\n" + 
> "   BATCH_ID BIGINT NOT NULL,\n" + 
> "   ITEM_ID BIGINT NOT NULL,\n" + 
> "   ITEM_TYPE BIGINT,\n" + 
> "   ITEM_VALUE VARCHAR,\n" + 
> "   CONSTRAINT PK PRIMARY KEY\n" + 
> "   (\n" + 
> "BATCH_ID,\n" + 
> "ITEM_ID\n" + 
> "   )\n" + 
> ")";
> conn.createStatement().execute(ddl2);
> String t3="COMPLETED_ITEMS";
> String ddl3 = "CREATE TABLE " + t3 + "(\n" + 
> "   ITEM_TYPE  BIGINT NOT NULL,\n" + 
> "   BATCH_SEQUENCE_NUM BIGINT NOT NULL,\n" + 
> "   ITEM_IDBIGINT NOT NULL,\n" + 
> "   ITEM_VALUE VARCHAR,\n" + 
> "   CONSTRAINT PK PRIMARY KEY\n" + 
> "   (\n" + 
> "  ITEM_TYPE,\n" + 
> "  BATCH_SEQUENCE_NUM,  \n" + 
> "  ITEM_ID\n" + 
> "   )\n" + 
> ")";
>  

[jira] [Commented] (PHOENIX-4112) Allow JDBC url-based Kerberos credentials via sqlline-thin.py

2017-08-22 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136977#comment-16136977
 ] 

Josh Elser commented on PHOENIX-4112:
-

Thanks for filing, Sunil.

The core of this issue is that SqllineWrapper, the Java class that 
sqlline-thin.py invokes, requires that the user already have a populated ticket 
cache. There's no reason that we can't allow the user to simply provide 
credentials and log them in automatically.

Avatica has the ability to perform the login already at the JAAS level; 
however, I don't know if this will "play nicely" with UserGroupInformation.
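A sketch of the automatic login idea, assuming Hadoop's UserGroupInformation 
API; the principal and keytab values would be parsed out of the JDBC url, and 
the wrapper class here is hypothetical:
{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical wrapper: log in from the keytab given in the JDBC url,
// then run the wrapped command (e.g. sqlline's main) as that user.
public final class KeytabLoginSketch {
    static void runAs(String principal, String keytab, Runnable command)
            throws Exception {
        UserGroupInformation ugi =
            UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab);
        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            command.run();
            return null;
        });
    }
}
{code}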

> Allow JDBC url-based Kerberos credentials via sqlline-thin.py
> -
>
> Key: PHOENIX-4112
> URL: https://issues.apache.org/jira/browse/PHOENIX-4112
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Sunil Kumar Sattiraju
>
> In a kerberized environment, sqlline-thin.py supports authenticating after 
> the user loads a keytab using kinit. The Phoenix JDBC thin driver supports 
> using a keytab like below:
> jdbc:phoenix:thin:url=http://queryserver.domain:8765;serialization=PROTOBUF;authentication=SPNEGO;principal=phoe...@example.com;keytab=/etc/security/keytabs/phoenix.keytab
> So, improve SqllineWrapper to support these arguments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4112) Allow JDBC url-based Kerberos credentials via sqlline-thin.py

2017-08-22 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated PHOENIX-4112:

Summary: Allow JDBC url-based Kerberos credentials via sqlline-thin.py  
(was: Support sqlline-thin.py to use keytab for authentication to Phoenix query 
server -  improve SqllineWrapper)

> Allow JDBC url-based Kerberos credentials via sqlline-thin.py
> -
>
> Key: PHOENIX-4112
> URL: https://issues.apache.org/jira/browse/PHOENIX-4112
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Sunil Kumar Sattiraju
>
> In a kerberized environment, sqlline-thin.py supports authenticating after 
> the user loads a keytab using kinit. The Phoenix JDBC thin driver supports 
> using a keytab like below:
> jdbc:phoenix:thin:url=http://queryserver.domain:8765;serialization=PROTOBUF;authentication=SPNEGO;principal=phoe...@example.com;keytab=/etc/security/keytabs/phoenix.keytab
> So, improve SqllineWrapper to support these arguments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-2048) change to_char() function to use HALF_UP rounding mode

2017-08-22 Thread Csaba Skrabak (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Skrabak updated PHOENIX-2048:
---
Attachment: PHOENIX-2048_v2.patch

> change to_char() function to use HALF_UP rounding mode
> --
>
> Key: PHOENIX-2048
> URL: https://issues.apache.org/jira/browse/PHOENIX-2048
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: verify
>Reporter: Jonathan Leech
>Assignee: Csaba Skrabak
>Priority: Minor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-2048.patch, PHOENIX-2048_v2.patch
>
>
> The to_char() function uses the default rounding mode of Java's 
> DecimalFormat, which is a strange one called HALF_EVEN: it rounds a '5' in 
> the last position either up or down depending on the preceding digit. 
> Change it to HALF_UP so it rounds the same way as the round() function does, 
> or provide a way to override the behavior, e.g. globally, as a client 
> config, or as an argument to the to_char() function.
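A standalone illustration (not Phoenix code) of the two rounding modes on a 
tie; BigDecimal inputs avoid binary floating-point artifacts:
{code}
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;

// Demonstrates HALF_EVEN (DecimalFormat's default) versus HALF_UP.
public class RoundingDemo {
    public static void main(String[] args) {
        DecimalFormat df = new DecimalFormat("0.0");
        System.out.println(df.format(new BigDecimal("0.25"))); // 0.2 (ties go to the even digit)
        System.out.println(df.format(new BigDecimal("0.35"))); // 0.4 (the even neighbor)
        df.setRoundingMode(RoundingMode.HALF_UP);
        System.out.println(df.format(new BigDecimal("0.25"))); // 0.3 (ties always round up)
    }
}
{code}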



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-2048) change to_char() function to use HALF_UP rounding mode

2017-08-22 Thread Csaba Skrabak (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136845#comment-16136845
 ] 

Csaba Skrabak commented on PHOENIX-2048:


Help, @hadoopqa lies:
* phoenix-core/src/it/java/org/apache/phoenix/end2end/ToCharFunctionIT.java 
_is_ a test and modification of it is included.
* There is no Javadoc warning about the modified files in the linked txt.
* Yes, I can break the long lines.

> change to_char() function to use HALF_UP rounding mode
> --
>
> Key: PHOENIX-2048
> URL: https://issues.apache.org/jira/browse/PHOENIX-2048
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: verify
>Reporter: Jonathan Leech
>Assignee: Csaba Skrabak
>Priority: Minor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-2048.patch
>
>
> The to_char() function uses the default rounding mode of Java's 
> DecimalFormat, which is a strange one called HALF_EVEN: it rounds a '5' in 
> the last position either up or down depending on the preceding digit. 
> Change it to HALF_UP so it rounds the same way as the round() function does, 
> or provide a way to override the behavior, e.g. globally, as a client 
> config, or as an argument to the to_char() function.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-4112) Support sqlline-thin.py to use keytab for authentication to Phoenix query server - improve SqllineWrapper

2017-08-22 Thread Sunil Kumar Sattiraju (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Kumar Sattiraju updated PHOENIX-4112:
---
Affects Version/s: (was: 4.11.0)
   (was: 4.10.0)
   (was: 4.9.0)

> Support sqlline-thin.py to use keytab for authentication to Phoenix query 
> server -  improve SqllineWrapper
> --
>
> Key: PHOENIX-4112
> URL: https://issues.apache.org/jira/browse/PHOENIX-4112
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Sunil Kumar Sattiraju
>
> In a kerberized environment, sqlline-thin.py supports authenticating after 
> the user loads a keytab using kinit. The Phoenix JDBC thin driver supports 
> using a keytab like below:
> jdbc:phoenix:thin:url=http://queryserver.domain:8765;serialization=PROTOBUF;authentication=SPNEGO;principal=phoe...@example.com;keytab=/etc/security/keytabs/phoenix.keytab
> So, improve SqllineWrapper to support these arguments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4112) Support sqlline-thin.py to use keytab for authentication to Phoenix query server - improve SqllineWrapper

2017-08-22 Thread Sunil Kumar Sattiraju (JIRA)
Sunil Kumar Sattiraju created PHOENIX-4112:
--

 Summary: Support sqlline-thin.py to use keytab for authentication 
to Phoenix query server -  improve SqllineWrapper
 Key: PHOENIX-4112
 URL: https://issues.apache.org/jira/browse/PHOENIX-4112
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 4.11.0, 4.10.0, 4.9.0
Reporter: Sunil Kumar Sattiraju


In a kerberized environment, sqlline-thin.py supports authenticating after the 
user loads a keytab using kinit. The Phoenix JDBC thin driver supports using a 
keytab like below:

jdbc:phoenix:thin:url=http://queryserver.domain:8765;serialization=PROTOBUF;authentication=SPNEGO;principal=phoe...@example.com;keytab=/etc/security/keytabs/phoenix.keytab

So, improve SqllineWrapper to support these arguments.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4076) Move master branch up to HBase 1.4.0-SNAPSHOT

2017-08-22 Thread Rajeshbabu Chintaguntla (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136368#comment-16136368
 ] 

Rajeshbabu Chintaguntla commented on PHOENIX-4076:
--

[~jamestaylor] The IndexHalfStoreFileReaderGenerator changes are fine in the 
patch.

> Move master branch up to HBase 1.4.0-SNAPSHOT
> -
>
> Key: PHOENIX-4076
> URL: https://issues.apache.org/jira/browse/PHOENIX-4076
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Attachments: PHOENIX-4076.patch
>
>
> Move master branch up to HBase 1.4.0-SNAPSHOT. 
> There are some compilation problems. 
> Valid compatibility breaks are addressed and fixed by HBASE-18431. This 
> analysis is a compilation attempt of Phoenix master branch against 
> 1.4.0-SNAPSHOT artifacts including the HBASE-18431 changes. 
> HBASE-16584 removed PayloadCarryingRpcController, breaking compilation of 
> MetadataRpcController, InterRegionServerIndexRpcControllerFactory, 
> IndexRpcController, ClientRpcControllerFactory, and 
> InterRegionServerMetadataRpcControllerFactory. This class was annotated as 
> Private so was fair game to remove. It will be gone in HBase 1.4.x and up. 
> DelegateRegionObserver needs to implement added interface method 
> postCommitStoreFile.
> DelegateHTable, TephraTransactionTable, and OmidTransactionTable need to 
> implement added interface methods for getting and setting read and write 
> timeouts. 
> PhoenixRpcScheduler needs to implement added interface methods for getting 
> handler counts. 
> Store file readers/writers/scanners have been refactored, and the local 
> index implementation, which implements or overrides parts of this refactored 
> hierarchy, will also have to be refactored.
> DelegateRegionCoprocessorEnvironment needs to implement added method 
> getMetricRegistryForRegionServer
> Another issue with IndexRpcController: incompatible types: int cannot be 
> converted to org.apache.hadoop.hbase.TableName
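A sketch of the delegate-style fix for one of these, assuming the read/write 
RPC timeout accessors added to the HBase 1.4 Table interface; the class here 
is illustrative, not the actual Phoenix delegate:
{code}
import org.apache.hadoop.hbase.client.Table;

// Illustrative delegate: forward the new timeout accessors to the wrapped
// table, which is all the added interface methods require of a delegate.
public class DelegateTableSketch {
    private final Table delegate;

    public DelegateTableSketch(Table delegate) {
        this.delegate = delegate;
    }

    public int getReadRpcTimeout() { return delegate.getReadRpcTimeout(); }
    public void setReadRpcTimeout(int t) { delegate.setReadRpcTimeout(t); }
    public int getWriteRpcTimeout() { return delegate.getWriteRpcTimeout(); }
    public void setWriteRpcTimeout(int t) { delegate.setWriteRpcTimeout(t); }
}
{code}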



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)