[GitHub] incubator-hawq pull request #1379: HAWQ-1622. Cache PXF proxy UGI so that cl...

2018-07-11 Thread benchristel
Github user benchristel commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1379#discussion_r201877270
  
--- Diff: 
pxf/pxf-service/src/test/java/org/apache/hawq/pxf/service/UGICacheTest.java ---
@@ -0,0 +1,197 @@
+package org.apache.hawq.pxf.service;
+
+import org.apache.hadoop.security.UserGroupInformation;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.stubbing.Answer;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.*;
+import static org.powermock.api.mockito.PowerMockito.when;
+
+public class UGICacheTest {
+private UGIProvider provider = null;
+private SessionId session = null;
+private UGICache cache = null;
+
+@Before
+public void setUp() throws Exception {
+provider = mock(UGIProvider.class);
+
+
when(provider.createProxyUGI(any(String.class))).thenAnswer((Answer)
 invocation -> mock(UserGroupInformation.class));
+
+session = new SessionId(0, "txn-id", "the-user");
+
+cache = new UGICache(provider);
+}
+
+@Test
+public void getUGIFromEmptyCache() throws Exception {
+UGICacheEntry entry = cache.getTimedProxyUGI(session);
+assertNotNull(entry.getUGI());
+verify(provider).createProxyUGI("the-user");
+}
+
+@Test
+public void getSameUGITwiceUsesCache() throws Exception {
+UGICacheEntry entry1 = cache.getTimedProxyUGI(session);
+UGICacheEntry entry2 = cache.getTimedProxyUGI(session);
+assertEquals(entry1, entry2);
+assertNotNull(entry1.getUGI());
+verify(provider, times(1)).createProxyUGI("the-user");
+}
+
+@Test
+public void getTwoUGIsWithDifferentSessionsForSameUser() throws 
Exception {
+SessionId otherSession = new SessionId(0, "txn-id-2", "the-user");
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2 = cache.getTimedProxyUGI(otherSession);
+assertNotEquals(proxyUGI1, proxyUGI2);
+verify(provider, times(2)).createProxyUGI("the-user");
+// TODO: this seems weird. We're creating two UGIs with the same 
params,
+// even though we have two different sessions. Why?
+}
+
+@Test
+public void getTwoUGIsWithDifferentUsers() throws Exception {
+SessionId otherSession = new SessionId(0, "txn-id", 
"different-user");
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2 = cache.getTimedProxyUGI(otherSession);
+assertNotEquals(proxyUGI1, proxyUGI2);
+verify(provider, times(1)).createProxyUGI("the-user");
+verify(provider, times(1)).createProxyUGI("different-user");
+}
+
+@Test
+public void getTwoUGIsWithDifferentUsersCachesBoth() throws Exception {
+SessionId otherSession = new SessionId(0, "txn-id", 
"different-user");
+UGICacheEntry proxyUGI1a = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI1b = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2a = cache.getTimedProxyUGI(otherSession);
+UGICacheEntry proxyUGI2b = cache.getTimedProxyUGI(otherSession);
+assertEquals(proxyUGI1a, proxyUGI1b);
+assertEquals(proxyUGI2a, proxyUGI2b);
+assertNotEquals(proxyUGI1a, proxyUGI2a);
+verify(provider, times(1)).createProxyUGI("the-user");
+verify(provider, times(1)).createProxyUGI("different-user");
+}
+
+@Test
+public void getUGIWhenRequestedUserDoesNotExist() throws Exception {
+// what does UserGroupInformation.createProxyUser() do in this 
scenario?
+// how about getLoginUser()?
+}
+
+@Test
+public void anySegmentIdIsValid() throws Exception {
+session = new SessionId(65, "txn-id", "the-user");
--- End diff --

We used this test to prove that our switch to using a `ConcurrentHashMap` 
instead of an array fixed the problem. Maybe it can be deleted now (or we could 
change the number 65 to something crazier like 2^31).


---


[GitHub] incubator-hawq pull request #1379: HAWQ-1622. Cache PXF proxy UGI so that cl...

2018-07-11 Thread benchristel
Github user benchristel commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1379#discussion_r201876799
  
--- Diff: 
pxf/pxf-service/src/test/java/org/apache/hawq/pxf/service/UGICacheTest.java ---
@@ -0,0 +1,197 @@
+package org.apache.hawq.pxf.service;
+
+import org.apache.hadoop.security.UserGroupInformation;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.stubbing.Answer;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.*;
+import static org.powermock.api.mockito.PowerMockito.when;
+
+public class UGICacheTest {
+private UGIProvider provider = null;
+private SessionId session = null;
+private UGICache cache = null;
+
+@Before
+public void setUp() throws Exception {
+provider = mock(UGIProvider.class);
+
+
when(provider.createProxyUGI(any(String.class))).thenAnswer((Answer)
 invocation -> mock(UserGroupInformation.class));
+
+session = new SessionId(0, "txn-id", "the-user");
+
+cache = new UGICache(provider);
+}
+
+@Test
+public void getUGIFromEmptyCache() throws Exception {
+UGICacheEntry entry = cache.getTimedProxyUGI(session);
+assertNotNull(entry.getUGI());
+verify(provider).createProxyUGI("the-user");
--- End diff --

The ref count and expiration time are internal details of the cache, so 
rather than directly test them, we have other tests that simulate time passing, 
acquire multiple references etc. and assert that the UGIs are destroyed at the 
right time.

I believe the mock verification 
`verify(provider).createProxyUGI("the-user")` is as close as we can get to 
verifying the principals of the UGI, because we're using a mock UGI here.


---


[GitHub] incubator-hawq pull request #1379: HAWQ-1622. Cache PXF proxy UGI so that cl...

2018-07-11 Thread benchristel
Github user benchristel commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1379#discussion_r201875131
  
--- Diff: 
pxf/pxf-service/src/main/java/org/apache/hawq/pxf/service/UGICacheEntry.java ---
@@ -0,0 +1,94 @@
+package org.apache.hawq.pxf.service;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.Delayed;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.hadoop.security.UserGroupInformation;
+
+public class UGICacheEntry implements Delayed {
+
+private volatile long startTime;
+private UserGroupInformation proxyUGI;
+private SessionId session;
+private boolean cleaned = false;
+private AtomicInteger inProgress = new AtomicInteger();
+private static long UGI_CACHE_EXPIRY = 15 * 60 * 1000L; // 15 Minutes
--- End diff --

For testing? Or for some other reason?


---


[GitHub] incubator-hawq pull request #1379: HAWQ-1622. Cache PXF proxy UGI so that cl...

2018-07-11 Thread benchristel
Github user benchristel commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1379#discussion_r201767807
  
--- Diff: 
pxf/pxf-service/src/test/java/org/apache/hawq/pxf/service/UGICacheTest.java ---
@@ -0,0 +1,197 @@
+package org.apache.hawq.pxf.service;
+
+import org.apache.hadoop.security.UserGroupInformation;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.stubbing.Answer;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.*;
+import static org.powermock.api.mockito.PowerMockito.when;
+
+public class UGICacheTest {
+private UGIProvider provider = null;
+private SessionId session = null;
+private UGICache cache = null;
+
+@Before
+public void setUp() throws Exception {
+provider = mock(UGIProvider.class);
+
+
when(provider.createProxyUGI(any(String.class))).thenAnswer((Answer)
 invocation -> mock(UserGroupInformation.class));
+
+session = new SessionId(0, "txn-id", "the-user");
+
+cache = new UGICache(provider);
+}
+
+@Test
+public void getUGIFromEmptyCache() throws Exception {
+UGICacheEntry entry = cache.getTimedProxyUGI(session);
+assertNotNull(entry.getUGI());
+verify(provider).createProxyUGI("the-user");
+}
+
+@Test
+public void getSameUGITwiceUsesCache() throws Exception {
+UGICacheEntry entry1 = cache.getTimedProxyUGI(session);
+UGICacheEntry entry2 = cache.getTimedProxyUGI(session);
+assertEquals(entry1, entry2);
+assertNotNull(entry1.getUGI());
+verify(provider, times(1)).createProxyUGI("the-user");
+}
+
+@Test
+public void getTwoUGIsWithDifferentSessionsForSameUser() throws 
Exception {
+SessionId otherSession = new SessionId(0, "txn-id-2", "the-user");
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2 = cache.getTimedProxyUGI(otherSession);
+assertNotEquals(proxyUGI1, proxyUGI2);
+verify(provider, times(2)).createProxyUGI("the-user");
+// TODO: this seems weird. We're creating two UGIs with the same 
params,
+// even though we have two different sessions. Why?
+}
+
+@Test
+public void getTwoUGIsWithDifferentUsers() throws Exception {
+SessionId otherSession = new SessionId(0, "txn-id", 
"different-user");
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2 = cache.getTimedProxyUGI(otherSession);
+assertNotEquals(proxyUGI1, proxyUGI2);
+verify(provider, times(1)).createProxyUGI("the-user");
+verify(provider, times(1)).createProxyUGI("different-user");
+}
+
+@Test
+public void getTwoUGIsWithDifferentUsersCachesBoth() throws Exception {
+SessionId otherSession = new SessionId(0, "txn-id", 
"different-user");
+UGICacheEntry proxyUGI1a = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI1b = cache.getTimedProxyUGI(session);
+UGICacheEntry proxyUGI2a = cache.getTimedProxyUGI(otherSession);
+UGICacheEntry proxyUGI2b = cache.getTimedProxyUGI(otherSession);
+assertEquals(proxyUGI1a, proxyUGI1b);
+assertEquals(proxyUGI2a, proxyUGI2b);
+assertNotEquals(proxyUGI1a, proxyUGI2a);
+verify(provider, times(1)).createProxyUGI("the-user");
+verify(provider, times(1)).createProxyUGI("different-user");
+}
+
+@Test
+public void getUGIWhenRequestedUserDoesNotExist() throws Exception {
+// what does UserGroupInformation.createProxyUser() do in this 
scenario?
+// how about getLoginUser()?
+}
+
+@Test
+public void anySegmentIdIsValid() throws Exception {
+session = new SessionId(65, "txn-id", "the-user");
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+assertNotNull(proxyUGI1.getUGI());
+}
+
+@Test
+public void releaseWithoutForceClean() throws Exception {
+UGICacheEntry proxyUGI1 = cache.getTimedProxyUGI(session);
+
+cache.release(proxyUGI1, false);
+// UGI wasn't cleaned up, so we can still get it
+UGICacheEntry proxyUGI2 = cache.getTimedProxyUGI(session);
+assertEquals(proxyUGI1, proxyUGI2);
+verify(provider, times(1)).createProxyUGI("the-user");
+}
+
+@Test
+public void 

[jira] [Commented] (HAWQ-1638) Issues with website

2018-07-11 Thread Radar Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539794#comment-16539794
 ] 

Radar Lei commented on HAWQ-1638:
-

Thanks [~s...@apache.org], we will continue to fix the issues you mentioned.

> Issues with website
> ---
>
> Key: HAWQ-1638
> URL: https://issues.apache.org/jira/browse/HAWQ-1638
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Radar Lei
>Priority: Major
>
> The HAWQ page looks nice, however there are a few problems with it.
> The phrase
> "Plus, HAWQ® works Apache MADlib) machine learning libraries"
> does not read well. Something missing?
> The first/main references to ASF products such as Hadoop, YARN etc must use 
> the full name, i.e. Apache Hadoop etc.
> The download section does not have any link to the KEYS file, nor any 
> instructions on how to use the KEYS+sig or hashes to validate downloads.
> The download section still includes references to MD5 hashes.
> These are deprecated and can be removed for older releases that have other 
> hashes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1638) Issues with website

2018-07-11 Thread Sebb (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539759#comment-16539759
 ] 

Sebb commented on HAWQ-1638:


Thanks. Still some minor issues.

It's not obvious that HCatalog, Parquet, AVRO, HBase are all Apache software.

Downloaders won't necessarily see the section on verification (the page is very 
long).
So it would be good to put a link, e.g. add a line such as the following with 
each version
"You must _verify_ the downloaded files" (where _ means link to relevant 
section)

You could also drop the earlier versions and point to the archives instead.

% gpg --verify ${filename}.tar.gz.asc 
should really be
% gpg --verify ${filename}.tar.gz.asc  ${filename}.tar.gz

i.e. both the detached sig and the artifact itself should be specified.
See: https://www.apache.org/info/verification.html#CheckingSignatures


> Issues with website
> ---
>
> Key: HAWQ-1638
> URL: https://issues.apache.org/jira/browse/HAWQ-1638
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Radar Lei
>Priority: Major
>
> The HAWQ page looks nice, however there are a few problems with it.
> The phrase
> "Plus, HAWQ® works Apache MADlib) machine learning libraries"
> does not read well. Something missing?
> The first/main references to ASF products such as Hadoop, YARN etc must use 
> the full name, i.e. Apache Hadoop etc.
> The download section does not have any link to the KEYS file, nor any 
> instructions on how to use the KEYS+sig or hashes to validate downloads.
> The download section still includes references to MD5 hashes.
> These are deprecated and can be removed for older releases that have other 
> hashes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1638) Issues with website

2018-07-11 Thread Radar Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539737#comment-16539737
 ] 

Radar Lei commented on HAWQ-1638:
-

[~s...@apache.org], I have fixed the webpage as this ticket describes, please 
help to review, thanks.

[http://hawq.incubator.apache.org/]

[https://github.com/apache/incubator-hawq-site/pull/16]

[https://github.com/apache/incubator-hawq-site/pull/17]

 

> Issues with website
> ---
>
> Key: HAWQ-1638
> URL: https://issues.apache.org/jira/browse/HAWQ-1638
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Radar Lei
>Priority: Major
>
> The HAWQ page looks nice, however there are a few problems with it.
> The phrase
> "Plus, HAWQ® works Apache MADlib) machine learning libraries"
> does not read well. Something missing?
> The first/main references to ASF products such as Hadoop, YARN etc must use 
> the full name, i.e. Apache Hadoop etc.
> The download section does not have any link to the KEYS file, nor any 
> instructions on how to use the KEYS+sig or hashes to validate downloads.
> The download section still includes references to MD5 hashes.
> These are deprecated and can be removed for older releases that have other 
> hashes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1638) Issues with website

2018-07-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539647#comment-16539647
 ] 

ASF GitHub Bot commented on HAWQ-1638:
--

Github user jiny2 commented on the issue:

https://github.com/apache/incubator-hawq-site/pull/17
  
+1, LGTM. Thank you.


> Issues with website
> ---
>
> Key: HAWQ-1638
> URL: https://issues.apache.org/jira/browse/HAWQ-1638
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Radar Lei
>Priority: Major
>
> The HAWQ page looks nice, however there are a few problems with it.
> The phrase
> "Plus, HAWQ® works Apache MADlib) machine learning libraries"
> does not read well. Something missing?
> The first/main references to ASF products such as Hadoop, YARN etc must use 
> the full name, i.e. Apache Hadoop etc.
> The download section does not have any link to the KEYS file, nor any 
> instructions on how to use the KEYS+sig or hashes to validate downloads.
> The download section still includes references to MD5 hashes.
> These are deprecated and can be removed for older releases that have other 
> hashes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] incubator-hawq-site issue #17: HAWQ-1638. Add how to verify downloaded files...

2018-07-11 Thread jiny2
Github user jiny2 commented on the issue:

https://github.com/apache/incubator-hawq-site/pull/17
  
+1, LGTM. Thank you.


---