[jira] [Resolved] (HIVE-21483) Fix HoS when scratch_dir is using remote HDFS

2019-03-20 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun resolved HIVE-21483.
---
Resolution: Duplicate

> Fix HoS when scratch_dir is using remote HDFS
> -
>
> Key: HIVE-21483
> URL: https://issues.apache.org/jira/browse/HIVE-21483
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Major
>
> HoS would fail when the scratch dir is on a remote HDFS:
> {noformat}
>   public static URI uploadToHDFS(URI source, HiveConf conf) throws IOException {
>     Path localFile = new Path(source.getPath());
>     Path remoteFile = new Path(SessionState.get().getSparkSession().getHDFSSessionDir(),
>         getFileName(source));
> -   FileSystem fileSystem = FileSystem.get(conf);
> +   FileSystem fileSystem = remoteFile.getFileSystem(conf);
>     // Overwrite if the remote file already exists. Whether the file can be added
>     // on executor is up to spark, i.e. spark.files.overwrite
>     fileSystem.copyFromLocalFile(false, true, localFile, remoteFile);
>     Path fullPath = fileSystem.getFileStatus(remoteFile).getPath();
> {noformat}
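The root cause of the fix above is that `FileSystem.get(conf)` always returns the filesystem for `fs.defaultFS`, while `remoteFile.getFileSystem(conf)` returns the filesystem that actually owns the path. A minimal sketch of the difference, using `java.net.URI` to stand in for Hadoop's scheme/authority resolution (the `local-ns` and `remote-ns` hosts are hypothetical, not from the issue):

```java
import java.net.URI;

public class ScratchDirResolution {
    // Hypothetical cluster addresses: the default filesystem (fs.defaultFS)
    // and a scratch dir living on a different, remote HDFS namespace.
    static final URI DEFAULT_FS = URI.create("hdfs://local-ns:8020/");
    static final URI REMOTE_SCRATCH =
        URI.create("hdfs://remote-ns:8020/tmp/spark-session/file.jar");

    // Mimics FileSystem.get(conf): the path is resolved against the default
    // filesystem, silently dropping the remote authority.
    static URI resolveAgainstDefaultFs(URI path) {
        return DEFAULT_FS.resolve(path.getPath());
    }

    // Mimics remoteFile.getFileSystem(conf): the filesystem is taken from the
    // path itself, so the remote authority is preserved.
    static URI resolveAgainstPathFs(URI path) {
        return path;
    }

    public static void main(String[] args) {
        // Wrong cluster: hdfs://local-ns:8020/tmp/spark-session/file.jar
        System.out.println(resolveAgainstDefaultFs(REMOTE_SCRATCH));
        // Correct cluster: hdfs://remote-ns:8020/tmp/spark-session/file.jar
        System.out.println(resolveAgainstPathFs(REMOTE_SCRATCH));
    }
}
```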



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21483) Fix HoS when scratch_dir is using remote HDFS

2019-03-20 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-21483:
--
Summary: Fix HoS when scratch_dir is using remote HDFS  (was: Fix HoS when 
scratch dir is using remote HDFS)






[jira] [Updated] (HIVE-21483) Fix HoS when scratch dir is using remote HDFS

2019-03-20 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-21483:
--
Summary: Fix HoS when scratch dir is using remote HDFS  (was: HoS would 
fail when scratch dir is using remote HDFS)






[jira] [Assigned] (HIVE-21483) HoS would fail when scratch dir is using remote HDFS

2019-03-20 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-21483:
-







[jira] [Updated] (HIVE-21483) HoS would fail when scratch dir is using remote HDFS

2019-03-20 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-21483:
--
Description: 
HoS would fail when scratch dir is using remote HDFS:

{noformat}
  public static URI uploadToHDFS(URI source, HiveConf conf) throws IOException {
    Path localFile = new Path(source.getPath());
    Path remoteFile = new Path(SessionState.get().getSparkSession().getHDFSSessionDir(),
        getFileName(source));
-   FileSystem fileSystem = FileSystem.get(conf);
+   FileSystem fileSystem = remoteFile.getFileSystem(conf);
    // Overwrite if the remote file already exists. Whether the file can be added
    // on executor is up to spark, i.e. spark.files.overwrite
    fileSystem.copyFromLocalFile(false, true, localFile, remoteFile);
    Path fullPath = fileSystem.getFileStatus(remoteFile).getPath();
{noformat}

  was:
HoS would fail when scratch dir is using remote HDFS:

  public static URI uploadToHDFS(URI source, HiveConf conf) throws IOException {
    Path localFile = new Path(source.getPath());
    Path remoteFile = new Path(SessionState.get().getSparkSession().getHDFSSessionDir(),
        getFileName(source));
-   FileSystem fileSystem = FileSystem.get(conf);
+   FileSystem fileSystem = remoteFile.getFileSystem(conf);
    // Overwrite if the remote file already exists. Whether the file can be added
    // on executor is up to spark, i.e. spark.files.overwrite
    fileSystem.copyFromLocalFile(false, true, localFile, remoteFile);
    Path fullPath = fileSystem.getFileStatus(remoteFile).getPath();







[jira] [Commented] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530976#comment-16530976
 ] 

Dapeng Sun commented on HIVE-20063:
---

Thanks [~prasanth_j] for your comments. Yes, HIVE-16917 focuses on the user/IP-address 
level; for the HiveServer2 service, having a global limit on concurrent connections 
is also meaningful.

> Global limit concurrent connections of HiveServer2
> --
>
> Key: HIVE-20063
> URL: https://issues.apache.org/jira/browse/HIVE-20063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Critical
> Fix For: 2.4.0, 3.1.0
>
>
> HS2 should have the ability to configure a global limit on concurrent connections 
> to HiveServer2. It should reject new connections once the limit is reached.
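The rejection behavior described above can be sketched with a simple atomic counter. The class and method names below are illustrative only and do not match HiveServer2's actual implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a global connection limit for a server:
// admit a connection only while the open-connection count is below the cap.
public class ConnectionLimiter {
    private final int maxConnections;
    private final AtomicInteger open = new AtomicInteger();

    public ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    // Try to admit a new connection; returns false when the global limit
    // is reached, so the caller can reject the connect attempt.
    public boolean tryOpen() {
        while (true) {
            int current = open.get();
            if (current >= maxConnections) {
                return false; // reject: global limit reached
            }
            if (open.compareAndSet(current, current + 1)) {
                return true;  // admitted
            }
            // CAS lost a race with another thread; retry.
        }
    }

    // Release the slot when a connection closes.
    public void close() {
        open.decrementAndGet();
    }
}
```

The compare-and-set loop keeps the check-then-increment atomic under concurrent connection attempts without a lock.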





[jira] [Updated] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-20063:
--
Affects Version/s: 2.3.2






[jira] [Updated] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-20063:
--
Affects Version/s: 3.0.0






[jira] [Updated] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-20063:
--
Fix Version/s: 3.1.0
   2.4.0






[jira] [Updated] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-20063:
--
Description: HS2 should have the ability to configure a global limit on 
concurrent connections to HiveServer2. It should reject new connections once 
the limit is reached.






[jira] [Assigned] (HIVE-20063) Global limit concurrent connections of HiveServer2

2018-07-03 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-20063:
-








[jira] [Commented] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-18 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210422#comment-16210422
 ] 

Dapeng Sun commented on HIVE-17823:
---

Thanks [~vgarg] for your review.

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17823.001.patch
>
>
> This JIRA fixes the HoS qtest failures caused by the subquery fix introduced 
> in HIVE-17726.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-17 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Description: The JIRA is targeted to fix the Qtest files failures of HoS 
due to HIVE-17726 introduced subquery fix.  (was: The JIRA is targeted to fix 
the Qtest files failures of HoS due to HIVE-17726 introduced subquery exist 
fix.)






[jira] [Commented] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-17 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207075#comment-16207075
 ] 

Dapeng Sun commented on HIVE-17823:
---

Attached the patch.






[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-17 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Attachment: HIVE-17823.001.patch






[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-17 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Status: Patch Available  (was: Open)






[jira] [Assigned] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-17823:
-

Assignee: Dapeng Sun






[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Description: The JIRA is targeted to fix the Qtest files failures of HoS 
due to HIVE-17726 introduced subquery exist fix.






[jira] [Updated] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17823:
--
Affects Version/s: 3.0.0







[jira] [Resolved] (HIVE-17756) Enable subquery related Qtests for Hive on Spark

2017-10-16 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun resolved HIVE-17756.
---
Resolution: Fixed

Thanks all for the comments. I have opened a new JIRA, HIVE-17823, to fix it.

> Enable subquery related Qtests for Hive on Spark
> 
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-17756.001.patch
>
>
> HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan subqueries. 
> This JIRA introduces subquery tests and verifies the subquery plans for 
> Hive on Spark.





[jira] [Commented] (HIVE-17756) Enable subquery related Qtests for Hive on Spark

2017-10-13 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204412#comment-16204412
 ] 

Dapeng Sun commented on HIVE-17756:
---

Thanks, Xuefu, for your review and comments.






[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Description: HIVE-15456 and HIVE-15192 using Calsite to decorrelate and 
plan subqueries. This JIRA is to indroduce subquery test and verify the 
subqueries plan for Hive on Spark  (was: HIVE-15456 and HIVE-15192 using 
Calsite to decorrelate and plan subqueries. This JIRA is to indroduce and 
verify the subqueries plan for Hive on Spark)






[jira] [Comment Edited] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199664#comment-16199664
 ] 

Dapeng Sun edited comment on HIVE-17756 at 10/11/17 1:19 AM:
-

The subquery plan of HoS looks good in these test cases.

[~xuefuz], [~Ferd] Do you have any comments?


was (Author: dapengsun):
The subquery plan of HoS looks good in these test cases. [~xuefuz] [~Ferd] Do 
you have any comments?






[jira] [Commented] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199664#comment-16199664
 ] 

Dapeng Sun commented on HIVE-17756:
---

The subquery plan of HoS looks good in these test cases. [~xuefuz] [~Ferd] Do 
you have any comments?






[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Description: HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan 
subqueries. This JIRA is to introduce and verify the subquery plans for 
Hive on Spark






[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Attachment: HIVE-17756.001.patch

Attached the patch







[jira] [Updated] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17756:
--
Status: Patch Available  (was: Open)







[jira] [Assigned] (HIVE-17756) Introduce subquery test case for Hive on Spark

2017-10-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-17756:
-








[jira] [Commented] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-07-04 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073260#comment-16073260
 ] 

Dapeng Sun commented on HIVE-17000:
---

Thank [~Ferd] for your review. Comparing with HIVE-16935, I think the UT 
failures are unrelated.

> Upgrade Hive to PARQUET 1.9.0
> -
>
> Key: HIVE-17000
> URL: https://issues.apache.org/jira/browse/HIVE-17000
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17000.001.patch, HIVE-17000.002.patch
>
>
> Parquet 1.9.0 has been released with many new features, such as PARQUET-601 
> (add support in Parquet to configure the encoding used by ValueWriters). 
> We should upgrade the Parquet dependency to 1.9.0 and bring these 
> optimizations to Hive.





[jira] [Updated] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-07-02 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17000:
--
Attachment: HIVE-17000.002.patch






[jira] [Commented] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-06-29 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069426#comment-16069426
 ] 

Dapeng Sun commented on HIVE-17000:
---

Submitted a simple patch to upgrade the Parquet dependency.






[jira] [Updated] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-06-29 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17000:
--
Attachment: HIVE-17000.001.patch






[jira] [Updated] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-06-29 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-17000:
--
Status: Patch Available  (was: Open)






[jira] [Assigned] (HIVE-17000) Upgrade Hive to PARQUET 1.9.0

2017-06-29 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-17000:
-





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation

2017-02-13 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864843#comment-15864843
 ] 

Dapeng Sun commented on HIVE-15682:
---

Hi Xuefu,

Okay, I will run these queries, but my cluster is being used by another 
colleague; I will reply to you when I get the cluster back.

> Eliminate per-row based dummy iterator creation
> ---
>
> Key: HIVE-15682
> URL: https://issues.apache.org/jira/browse/HIVE-15682
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row which can be eliminated. 
> This is because {{SparkReduceRecordHandler}} is able to handle single key 
> value pairs. We can refactor this part of code 1. to remove the need for a 
> iterator and 2. to optimize the code path for per (key, value) based (instead 
> of (key, value iterator)) processing. It would be also great if we can 
> measure the performance after the optimizations and compare to performance 
> prior to HIVE-15580.
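The per-row dummy-iterator overhead described above can be sketched in a few lines. The handler class and method names below are hypothetical, used only for illustration (the real entry point is {{SparkReduceRecordHandler}}): the old path wraps every single value in a one-element iterator just to reuse the (key, value-iterator) signature, while the optimized path processes the (key, value) pair directly.

```java
import java.util.Collections;
import java.util.Iterator;

// Hypothetical handler for illustration only; the real class is
// org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.
class RowHandler {
    int rowsProcessed = 0;

    // (key, value-iterator) path: HIVE-15580 wrapped every single value in a
    // one-element dummy iterator per input row just to reuse this signature.
    void processRow(Object key, Iterator<?> values) {
        while (values.hasNext()) {
            values.next();
            rowsProcessed++;
        }
    }

    // (key, value) path: handling the single pair directly avoids allocating
    // a short-lived iterator object for every row.
    void processRow(Object key, Object value) {
        rowsProcessed++;
    }
}

public class DummyIteratorDemo {
    public static void main(String[] args) {
        RowHandler handler = new RowHandler();
        // Old path: one iterator allocation per row.
        handler.processRow("key", Collections.singleton("value").iterator());
        // Optimized path: same work, no per-row allocation.
        handler.processRow("key", "value");
        System.out.println(handler.rowsProcessed);
    }
}
```

Both calls process exactly one row; the refactoring only removes the per-row allocation, which is why measuring throughput before and after is the interesting part.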



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-15682) Eliminate per-row based dummy iterator creation

2017-02-13 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863356#comment-15863356
 ] 

Dapeng Sun edited comment on HIVE-15682 at 2/13/17 9:03 AM:


Hi Xuefu, sorry, I didn't receive the email notification for this JIRA. 

Do you mean you want to test an {{order by}} query like {{`select count(*) from 
(select request_lat from dwh.fact_trip where datestr > '2017-01-27' order by 
request_lat) x;`}} both {{w/ HIVE-15580}} and {{w/o HIVE-15580}} in my 
environment?


was (Author: dapengsun):
Hi Xuefu, sorry, I didn't receive the email notification of this JIRA. 

Do you mean you want to test {{order by}} likes {{`select count(*) from (select 
request_lat from dwh.fact_trip where datestr > '2017-01-27' order by 
request_lat) x;`}} about {{ w/ HIVE-15580 }}, {{ w/o HIVE-15580 }} at my 
environment?

> Eliminate per-row based dummy iterator creation
> ---
>
> Key: HIVE-15682
> URL: https://issues.apache.org/jira/browse/HIVE-15682
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row which can be eliminated. 
> This is because {{SparkReduceRecordHandler}} is able to handle single key 
> value pairs. We can refactor this part of code 1. to remove the need for a 
> iterator and 2. to optimize the code path for per (key, value) based (instead 
> of (key, value iterator)) processing. It would be also great if we can 
> measure the performance after the optimizations and compare to performance 
> prior to HIVE-15580.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation

2017-02-13 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863356#comment-15863356
 ] 

Dapeng Sun commented on HIVE-15682:
---

Hi Xuefu, sorry, I didn't receive the email notification for this JIRA. 

Do you mean you want to test an {{order by}} query like {{`select count(*) from 
(select request_lat from dwh.fact_trip where datestr > '2017-01-27' order by 
request_lat) x;`}} both {{w/ HIVE-15580}} and {{w/o HIVE-15580}} in my 
environment?

> Eliminate per-row based dummy iterator creation
> ---
>
> Key: HIVE-15682
> URL: https://issues.apache.org/jira/browse/HIVE-15682
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row which can be eliminated. 
> This is because {{SparkReduceRecordHandler}} is able to handle single key 
> value pairs. We can refactor this part of code 1. to remove the need for a 
> iterator and 2. to optimize the code path for per (key, value) based (instead 
> of (key, value iterator)) processing. It would be also great if we can 
> measure the performance after the optimizations and compare to performance 
> prior to HIVE-15580.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation

2017-02-12 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863124#comment-15863124
 ] 

Dapeng Sun commented on HIVE-15682:
---

Hi [~xuefuz], here are the 1TB TPCx-BB results for HIVE-15580, 
without_HIVE-15580, and HIVE-15682:

https://docs.google.com/spreadsheets/d/1fJ8KFAJrPuLR4XDQNhTOCdHOIqouHKEjd_bFx_Gcu_A

> Eliminate per-row based dummy iterator creation
> ---
>
> Key: HIVE-15682
> URL: https://issues.apache.org/jira/browse/HIVE-15682
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row which can be eliminated. 
> This is because {{SparkReduceRecordHandler}} is able to handle single key 
> value pairs. We can refactor this part of code 1. to remove the need for a 
> iterator and 2. to optimize the code path for per (key, value) based (instead 
> of (key, value iterator)) processing. It would be also great if we can 
> measure the performance after the optimizations and compare to performance 
> prior to HIVE-15580.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation

2017-02-07 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857457#comment-15857457
 ] 

Dapeng Sun commented on HIVE-15682:
---

Hi [~xuefuz], I will use TPCx-BB to run a 1TB test comparing HIVE-15580, 
HIVE-15682, and the unpatched package; I will attach the results when I get them.

> Eliminate per-row based dummy iterator creation
> ---
>
> Key: HIVE-15682
> URL: https://issues.apache.org/jira/browse/HIVE-15682
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-15682.patch
>
>
> HIVE-15580 introduced a dummy iterator per input row which can be eliminated. 
> This is because {{SparkReduceRecordHandler}} is able to handle single key 
> value pairs. We can refactor this part of code 1. to remove the need for a 
> iterator and 2. to optimize the code path for per (key, value) based (instead 
> of (key, value iterator)) processing. It would be also great if we can 
> measure the performance after the optimizations and compare to performance 
> prior to HIVE-15580.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829218#comment-15829218
 ] 

Dapeng Sun commented on HIVE-15580:
---

Thank you [~xuefuz] for the suggestion. Currently the heap size is 290G per 
executor; I will try to do more tuning on it.

> Replace Spark's groupByKey operator with something with bounded memory
> --
>
> Key: HIVE-15580
> URL: https://issues.apache.org/jira/browse/HIVE-15580
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.2.patch, HIVE-15580.3.patch, HIVE-15580.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829154#comment-15829154
 ] 

Dapeng Sun edited comment on HIVE-15580 at 1/19/17 2:05 AM:


Thank you [~xuefuz], [~csun] and [~Ferd]. We are running a 100TB data-skew test 
case on 50 nodes ([TPC-xBB 
q21|https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench/tree/master/engines/hive/queries/q21]).
 Before the patches, Spark tasks failed with the following error:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:3181)
 at java.util.ArrayList.grow(ArrayList.java:261)
 at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
 at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
 at java.util.ArrayList.add(ArrayList.java:458)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:100)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:75)
 at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
 at 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
 at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
 at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:89)
 at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
{noformat}
After applying the patches (HIVE-15580 and HIVE-15527 respectively), the 
ArrayList issues are both fixed, but PartitionedPairBuffer on the Spark side 
also causes OOM. Here is the failed task exception:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at 
org.apache.spark.util.collection.PartitionedPairBuffer.growArray(PartitionedPairBuffer.scala:67)
at 
org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:48)
at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)
at 
org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:111)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:98)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
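A note on the error itself: {{Requested array size exceeds VM limit}} is independent of heap size. Java arrays are indexed by {{int}}, so once {{ArrayList}}'s 1.5x growth policy requests a capacity near {{Integer.MAX_VALUE}}, the JVM refuses the allocation even with a 290G executor heap. A small simulation of the growth sequence (a sketch of JDK 8 {{ArrayList}} internals, not Hive code):

```java
// Simulates ArrayList's capacity growth: grow() computes
// newCapacity = oldCapacity + (oldCapacity >> 1), i.e. 1.5x per resize.
// ArrayList's MAX_ARRAY_SIZE is Integer.MAX_VALUE - 8; a request beyond the
// int range fails with "Requested array size exceeds VM limit" no matter
// how much heap is available.
public class ArrayGrowthSim {
    public static void main(String[] args) {
        long capacity = 10;   // ArrayList's default initial capacity
        int resizes = 0;
        while (capacity <= Integer.MAX_VALUE - 8) {
            capacity += capacity >> 1;
            resizes++;
        }
        System.out.println(resizes + " resizes before the array limit is exceeded");
    }
}
```

In other words, a sufficiently skewed key group hits the array-length ceiling after a few dozen resizes regardless of memory, which is why a bounded-memory shuffle is needed rather than a bigger heap.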


was (Author: dapengsun):
Thank [~xuefuz], [~csun] and [~Ferd], we are running a 100TB test case about 
data skew on 50 nodes([TPC-xBB 
q21|https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench/tree/master/engines/hive/queries/q21]),
 before the patch, spark tasks are failed with following error:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:3181)
 at java.util.ArrayList.grow(ArrayList.java:261)
 at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
 at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
 at java.util.ArrayList.add(ArrayList.java:458)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:100)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:75)
   

[jira] [Comment Edited] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829154#comment-15829154
 ] 

Dapeng Sun edited comment on HIVE-15580 at 1/19/17 2:01 AM:


Thank you [~xuefuz], [~csun] and [~Ferd]. We are running a 100TB data-skew test 
case on 50 nodes ([TPC-xBB 
q21|https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench/tree/master/engines/hive/queries/q21]).
 Before the patch, Spark tasks failed with the following error:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:3181)
 at java.util.ArrayList.grow(ArrayList.java:261)
 at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
 at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
 at java.util.ArrayList.add(ArrayList.java:458)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:100)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:75)
 at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
 at 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
 at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
 at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:89)
 at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
{noformat}
After applying the patch, the ArrayList issue is fixed, but 
PartitionedPairBuffer also causes OOM. Here is the failed task exception:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at 
org.apache.spark.util.collection.PartitionedPairBuffer.growArray(PartitionedPairBuffer.scala:67)
at 
org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:48)
at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)
at 
org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:111)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:98)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}


was (Author: dapengsun):
Thank [~xuefuz], [~csun] and [~Ferd], we are running a 100TB test case about 
data skew on 50 nodes(TPC-xBB q21), before the patch, spark tasks are failed 
with following error:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:3181)
 at java.util.ArrayList.grow(ArrayList.java:261)
 at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
 at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
 at java.util.ArrayList.add(ArrayList.java:458)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:100)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:75)
 at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
 at 

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829154#comment-15829154
 ] 

Dapeng Sun commented on HIVE-15580:
---

Thank you [~xuefuz], [~csun] and [~Ferd]. We are running a 100TB data-skew test 
case on 50 nodes (TPC-xBB q21). Before the patch, Spark tasks failed with the 
following error:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf(Arrays.java:3181)
 at java.util.ArrayList.grow(ArrayList.java:261)
 at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
 at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
 at java.util.ArrayList.add(ArrayList.java:458)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:100)
 at 
org.apache.hadoop.hive.ql.exec.spark.SortByShuffler$ShuffleFunction$1.next(SortByShuffler.java:75)
 at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
 at 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
 at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
 at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
 at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:89)
 at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
{noformat}
After applying the patch, the ArrayList issue is fixed, but 
PartitionedPairBuffer also causes OOM. Here is the failed task exception:
{noformat}
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at 
org.apache.spark.util.collection.PartitionedPairBuffer.growArray(PartitionedPairBuffer.scala:67)
at 
org.apache.spark.util.collection.PartitionedPairBuffer.insert(PartitionedPairBuffer.scala:48)
at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)
at 
org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:111)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:98)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}

> Replace Spark's groupByKey operator with something with bounded memory
> --
>
> Key: HIVE-15580
> URL: https://issues.apache.org/jira/browse/HIVE-15580
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.2.patch, HIVE-15580.3.patch, HIVE-15580.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827376#comment-15827376
 ] 

Dapeng Sun edited comment on HIVE-15527 at 1/18/17 3:32 AM:


Thank you [~csun] and [~Ferd], here is the detailed log:
{noformat}
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.
{noformat}
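For what it's worth, {{NumberFormatException: null}} is what {{Long.parseLong}} throws when handed a null string (rather than a {{NullPointerException}}), so the failure in {{SparkPlanGenerator.generate}} points at a missing numeric configuration value, not a malformed one. A minimal reproduction:

```java
// Long.parseLong(null) throws NumberFormatException, not NPE; on Java 8 its
// message is simply "null", which matches the
// "java.lang.NumberFormatException: null" line in the driver log above.
public class ParseLongNullDemo {
    public static void main(String[] args) {
        try {
            Long.parseLong(null);
        } catch (NumberFormatException e) {
            System.out.println(e.getClass().getSimpleName());
        }
    }
}
```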


was (Author: dapengsun):
Thank [~csun] and [~Ferd], here is the detail log:
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.


> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values 
> that have the same key in shuffled result produced by spark transformation 
> sortByKey. It's possible that memory can be exhausted because of a large key 
> group.
> {code}
> @Override
> public Tuple2<HiveKey, Iterable<BytesWritable>> next() {
>   // TODO: implement this by accumulating rows with the same key 
> into a list.
>   // Note that this list needs to be improved to prevent excessive 
> memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
> Tuple2<HiveKey, BytesWritable> pair = it.next();
> if (curKey != null && !curKey.equals(pair._1())) {
>   HiveKey key = curKey;
>   List<BytesWritable> values = curValues;
>   curKey = pair._1();
>   curValues = new ArrayList<BytesWritable>();
>   curValues.add(pair._2());
>   return new Tuple2<HiveKey, Iterable<BytesWritable>>(key, values);
> }
> curKey = pair._1();
> curValues.add(pair._2());
>   }
>   if (curKey == null) {
> throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   

[jira] [Commented] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827376#comment-15827376
 ] 

Dapeng Sun commented on HIVE-15527:
---

Thank you [~csun] and [~Ferd], here is the detailed log:
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.


> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values 
> that have the same key in shuffled result produced by spark transformation 
> sortByKey. It's possible that memory can be exhausted because of a large key 
> group.
> {code}
> @Override
> public Tuple2<HiveKey, Iterable<BytesWritable>> next() {
>   // TODO: implement this by accumulating rows with the same key 
> into a list.
>   // Note that this list needs to be improved to prevent excessive 
> memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
> Tuple2<HiveKey, BytesWritable> pair = it.next();
> if (curKey != null && !curKey.equals(pair._1())) {
>   HiveKey key = curKey;
>   List<BytesWritable> values = curValues;
>   curKey = pair._1();
>   curValues = new ArrayList<BytesWritable>();
>   curValues.add(pair._2());
>   return new Tuple2<HiveKey, Iterable<BytesWritable>>(key, values);
> }
> curKey = pair._1();
> curValues.add(pair._2());
>   }
>   if (curKey == null) {
> throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   return new Tuple2<HiveKey, Iterable<BytesWritable>>(key, curValues);
> }
> {code}
> Since the output from sortByKey is already sorted on key, it's possible to 
> backup the value iterable using the same input iterator.
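The closing suggestion — that a key-sorted input lets each value iterable be backed by the input iterator itself — boils down to a single pass that keeps only O(1) per-group state instead of buffering a value list. A self-contained sketch (illustrative names, not Hive's classes; the per-group work here is just a count):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;

public class SortedGroupScan {
    // Reduces each key group of a key-sorted iterator in one pass; only the
    // current key and a running count are held, never a list of values.
    static String countGroups(Iterator<Map.Entry<String, Integer>> it) {
        String curKey = null;
        long count = 0;
        StringBuilder out = new StringBuilder();
        while (it.hasNext()) {
            Map.Entry<String, Integer> pair = it.next();
            if (curKey != null && !curKey.equals(pair.getKey())) {
                out.append(curKey).append('=').append(count).append(' ');
                count = 0;               // a new group starts; old one is done
            }
            curKey = pair.getKey();
            count++;
        }
        if (curKey != null) {            // flush the final group
            out.append(curKey).append('=').append(count);
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<String, Integer>> sorted =
            Arrays.<Map.Entry<String, Integer>>asList(
                new SimpleEntry<>("a", 1), new SimpleEntry<>("a", 2),
                new SimpleEntry<>("b", 3)).iterator();
        System.out.println(countGroups(sorted)); // a=2 b=1
    }
}
```

Memory stays constant no matter how large a key group is, which is exactly the property the unbounded {{ArrayList}} in {{SortByShuffler}} lacks.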



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-13 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-14916:
--
Attachment: HIVE-14916.004.patch

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch, HIVE-14916.004.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-12 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568015#comment-15568015
 ] 

Dapeng Sun commented on HIVE-14916:
---

Thank you [~sseth], I will try it and update the patch.

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-11 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567679#comment-15567679
 ] 

Dapeng Sun edited comment on HIVE-14916 at 10/12/16 5:40 AM:
-

[~sseth], do you mean the conf changes on {{MiniCluster}} are unnecessary?


was (Author: dapengsun):
[~sseth], do you mean the changes on {{MiniCluster}} is unnecessary?

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-11 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567679#comment-15567679
 ] 

Dapeng Sun commented on HIVE-14916:
---

[~sseth], do you mean the changes on {{MiniCluster}} are unnecessary?

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-11 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567323#comment-15567323
 ] 

Dapeng Sun edited comment on HIVE-14916 at 10/12/16 2:32 AM:
-

Thanks [~sseth], in my local environment the tests also pass after adding the 
following changes.
{noformat}
conf.setInt(YarnConfiguration.YARN_MINICLUSTER_NM_PMEM_MB, 4096);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 1024);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB, 4096);
{noformat}

I attached a new patch {{HIVE-14916.003}} with these changes to trigger Jenkins.


was (Author: dapengsun):
Thanks [~sseth], the tests also pass after adding the following changes.
{noformat}
conf.setInt(YarnConfiguration.YARN_MINICLUSTER_NM_PMEM_MB, 4096);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 1024);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB, 4096);
{noformat}

I attached a new patch {{HIVE-14916.003}} with these changes to trigger Jenkins.

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-11 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567323#comment-15567323
 ] 

Dapeng Sun commented on HIVE-14916:
---

Thanks [~sseth], the tests also pass after adding the following changes.
{noformat}
conf.setInt(YarnConfiguration.YARN_MINICLUSTER_NM_PMEM_MB, 4096);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 1024);
conf.setInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB, 4096);
{noformat}

I attached a new patch {{HIVE-14916.003}} with these changes to trigger Jenkins.
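For reference, the same memory limits could be expressed for a standalone cluster as a {{yarn-site.xml}} fragment. This is a hedged sketch: the property names shown are the standard YARN properties that these {{YarnConfiguration}} constants correspond to, and should be verified against the Hadoop version in use.
{noformat}
<!-- Physical memory the NodeManager may hand out to containers (MB) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<!-- Smallest and largest container the RM scheduler will allocate (MB) -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
{noformat}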

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-11 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-14916:
--
Attachment: HIVE-14916.003.patch

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-10 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562507#comment-15562507
 ] 

Dapeng Sun commented on HIVE-14916:
---

Thanks Fred and Sergio for your review.

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-09 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-14916:
--
Attachment: HIVE-14916.002.patch

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-09 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-14916:
--
Status: Patch Available  (was: Open)

Uploaded an initial patch.

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-09 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-14916:
--
Attachment: HIVE-14916.001.patch

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch
>
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-09 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reassigned HIVE-14916:
-

Assignee: Dapeng Sun

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
>
> As HIVE-14887, we need to reduce the memory requirements for Spark tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14029) Update Spark version to 2.0.0

2016-09-20 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508353#comment-15508353
 ] 

Dapeng Sun edited comment on HIVE-14029 at 9/21/16 1:24 AM:


[~Ferd]
Yes, I used this command


was (Author: dapengsun):
Yes, I used this command

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-14029.1.patch, HIVE-14029.patch
>
>
> There are quite some new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit those performance improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-09-20 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508353#comment-15508353
 ] 

Dapeng Sun commented on HIVE-14029:
---

Yes, I used this command

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-14029.1.patch, HIVE-14029.patch
>
>
> There are quite some new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit those performance improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12917) Document for Hive authorization V2

2016-08-12 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418212#comment-15418212
 ] 

Dapeng Sun commented on HIVE-12917:
---

Status update: Ke Jia is working on the document; the first version is under 
review by a few people. Please feel free to send an email to Ke if you want to 
review it.

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12917:
--
Comment: was deleted

(was: Status update: Ke Jia is working on the document; the initial version is 
under review by a few people. Please feel free to send an email to Ke if you 
want to review it.)

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418212#comment-15418212
 ] 

Dapeng Sun edited comment on HIVE-12917 at 8/12/16 1:09 AM:


Status update: Ke Jia is working on the document; the initial version is under 
review by a few people. Please feel free to send an email to Ke if you want to 
review it.


was (Author: dapengsun):
Status update: Ke Jia is working on the document; the first version is under 
review by a few people. Please feel free to send an email to Ke if you want to 
review it.

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reopened HIVE-12917:
---
  Assignee: Ke Jia  (was: Dapeng Sun)

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12917:
--
Status: Patch Available  (was: Reopened)

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2016-05-25 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301274#comment-15301274
 ] 

Dapeng Sun commented on HIVE-12495:
---

Thanks [~jcamachorodriguez], I have rebased the patch.

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12495.001.patch, HIVE-12495.002.patch
>
>
> Following the discussion at HIVE-12367, this JIRA targets fixing the inputs 
> and outputs for lock/unlock table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2016-05-25 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12495:
--
Attachment: HIVE-12495.002.patch

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12495.001.patch, HIVE-12495.002.patch
>
>
> Following the discussion at HIVE-12367, this JIRA targets fixing the inputs 
> and outputs for lock/unlock table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Attachment: HIVE-13545.001.patch

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-13545.001.patch
>
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should match 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Status: Patch Available  (was: Open)

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should match 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2016-04-19 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-13545:
--
Description: {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} 
doesn't have the {{GLOBAL}} type; it should match 
{{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
 At the same time, we should enable custom conversion from Entity to 
HivePrivilegeObject  (was: 
{{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
{{GLOBAL}} type; it should match 
{{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}})

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should match 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2016-03-30 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217672#comment-15217672
 ] 

Dapeng Sun commented on HIVE-12367:
---

Thanks [~ashutoshc] for your review and commit.

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 2.1.0
>
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch, HIVE-12367.004.patch, HIVE-12367.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12917) Document for Hive authorization V2

2016-01-24 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun resolved HIVE-12917.
---
Resolution: Invalid

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer

2015-12-16 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061520#comment-15061520
 ] 

Dapeng Sun commented on HIVE-12698:
---

Sorry for the repeated messages; something went wrong with my web browser.

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -
>
> Key: HIVE-12698
> URL: https://issues.apache.org/jira/browse/HIVE-12698
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12698.1.patch
>
>
> The changes in HIVE-11179 expose several internal classes to 
> HiveAuthorization implementations. These include PrivilegeObjectDesc, 
> PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing that to all Authorization implementations, but also 
> make the ability to customize the mapping of internal classes to the public 
> api classes possible for Apache Sentry (incubating).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer

2015-12-16 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061509#comment-15061509
 ] 

Dapeng Sun commented on HIVE-12698:
---

I agree with you; we can also fix getHivePrincipal() in {{DDLTask}}.

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -
>
> Key: HIVE-12698
> URL: https://issues.apache.org/jira/browse/HIVE-12698
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12698.1.patch
>
>
> The changes in HIVE-11179 expose several internal classes to 
> HiveAuthorization implementations. These include PrivilegeObjectDesc, 
> PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing that to all Authorization implementations, but also 
> make the ability to customize the mapping of internal classes to the public 
> api classes possible for Apache Sentry (incubating).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers

2015-12-16 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061343#comment-15061343
 ] 

Dapeng Sun commented on HIVE-11179:
---

Thanks [~thejas] for your follow-up. I will also think about how to minimize 
the API change.

> HIVE should allow custom converting from HivePrivilegeObjectDesc to 
> privilegeObject for different authorizers
> -
>
> Key: HIVE-11179
> URL: https://issues.apache.org/jira/browse/HIVE-11179
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>  Labels: Authorization
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch
>
>
> HIVE should allow custom converting from HivePrivilegeObjectDesc to 
> privilegeObject for different authorizers:
> There is a case in Apache Sentry: Sentry supports URI- and server-level 
> privileges, but on the Hive side 
> {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} does the 
> conversion, and the code in {{getHivePrivilegeObject()}} only handles the 
> table and database cases:
> {noformat}
> privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW :
> HivePrivilegeObjectType.DATABASE;
> {noformat}
> A solution is to move this method to {{HiveAuthorizer}} so that a custom 
> authorizer can override it.
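The proposed extension point can be sketched as follows. This is a simplified, hypothetical illustration only, not the actual Hive interface: the class and method names here ({{getPrivilegeObjectType}}, {{PrivilegeObjectDesc}}) are stand-ins, and the real signatures in Hive differ.
{noformat}
// Sketch: move the mapping into HiveAuthorizer so plugins can customize it.
public abstract class HiveAuthorizer {
  // Default mapping, which only distinguishes tables and databases.
  public HivePrivilegeObjectType getPrivilegeObjectType(PrivilegeObjectDesc desc) {
    return desc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW
                           : HivePrivilegeObjectType.DATABASE;
  }
}

// A Sentry-style authorizer could then override the method to also return
// URI- or server-level privilege object types.
{noformat}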



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-12-15 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Attachment: HIVE-12367.004.patch

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch, HIVE-12367.004.patch, HIVE-12367.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-12-15 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059574#comment-15059574
 ] 

Dapeng Sun commented on HIVE-12367:
---

Hi [~alangates], the failed cases (the same as 
https://issues.apache.org/jira/browse/HIVE-12675?focusedCommentId=15059400=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15059400
 ) are not related to my changes. Could you help me commit it if you don't have 
further questions? Thank you. :)

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch, HIVE-12367.004.patch, HIVE-12367.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-12-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Attachment: HIVE-12367.004.patch

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch, HIVE-12367.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-12-10 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052093#comment-15052093
 ] 

Dapeng Sun commented on HIVE-12367:
---

Updated the patch against master.

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch, HIVE-12367.004.patch
>
>






[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-25 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15028284#comment-15028284
 ] 

Dapeng Sun commented on HIVE-12367:
---

Thanks, [~alangates], for your review :)

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch
>
>






[jira] [Updated] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12495:
--
Target Version/s: 1.3.0, 2.0.0, 1.2.2
 Component/s: Authorization

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>






[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021688#comment-15021688
 ] 

Dapeng Sun commented on HIVE-12367:
---

Hi [~alangates], I have updated the comment on DDL_NO_LOCK, and I also created 
HIVE-12495, which will also depend on the newly added method 
{{isExplicitLockOperation()}}; otherwise the Driver would acquire a shared 
lock for each {{ReadEntity}}.
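For context, a quick HiveQL sketch of the explicit lock statements whose authz-hook inputs and outputs the two jiras adjust (the database and table names here are illustrative):

```sql
-- HIVE-12367: LOCK/UNLOCK DATABASE should surface db1 to the authz hook
LOCK DATABASE db1 SHARED;
UNLOCK DATABASE db1;

-- HIVE-12495: LOCK/UNLOCK TABLE should likewise surface the table
LOCK TABLE db1.t1 EXCLUSIVE;
UNLOCK TABLE db1.t1;
```

Note these statements typically take effect only when concurrency support is enabled ({{hive.support.concurrency=true}}).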

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch
>
>






[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Attachment: HIVE-12367.003.patch

Updated the patch

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch
>
>






[jira] [Updated] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12495:
--
Affects Version/s: 1.2.1

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> According to the discussion at HIVE-12367, this jira targets fixing the 
> inputs and outputs for lock/unlock table.





[jira] [Updated] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12495:
--
Attachment: HIVE-12495.001.patch

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12495.001.patch
>
>
> According to the discussion at HIVE-12367, this jira targets fixing the 
> inputs and outputs for lock/unlock table.





[jira] [Updated] (HIVE-12495) Lock/unlock table should add database and table information to inputs and outputs of authz hook

2015-11-22 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12495:
--
Description: According to the discussion at HIVE-12367, this jira targets 
fixing the inputs and outputs for lock/unlock table.

> Lock/unlock table should add database and table information to inputs and 
> outputs of authz hook
> ---
>
> Key: HIVE-12495
> URL: https://issues.apache.org/jira/browse/HIVE-12495
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>
> According to the discussion at HIVE-12367, this jira targets fixing the 
> inputs and outputs for lock/unlock table.





[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-20 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015432#comment-15015432
 ] 

Dapeng Sun commented on HIVE-12367:
---

I think the lock should still be {{DDL_NO_LOCK}}; I misunderstood LOCK and 
DDL_LOCK...

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch
>
>






[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-19 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015083#comment-15015083
 ] 

Dapeng Sun commented on HIVE-12367:
---

Thanks [~sershe] and [~alangates]. 
{quote}Do files need the out file update?{quote}
Thank you for pointing that out. I will update the out files. 
{quote}
A comment on why you are selecting DDL_NO_LOCK in the lock function would be 
helpful, since it's confusing. 
{quote}
How about using {{DDL_SHARED}} or {{DDL_EXCLUSIVE}} (depending on the lock 
mode) for Lock, and {{DDL_NO_LOCK}} for Unlock?

{quote}analyzeLockTable and analyzeUnlockTable suffer from the same problem. I 
don't know if you want to fix those as well.{quote}
Thank you for pointing that out. I'm willing to; I will either update it in 
this ticket or file another jira to fix it. Is that okay?

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch
>
>






[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-12 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Attachment: HIVE-12367.002.patch

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch
>
>






[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-08 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Component/s: Authorization

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12367.001.patch
>
>






[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-08 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Fix Version/s: 1.2.2
   2.0.0
   1.3.0

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12367.001.patch
>
>






[jira] [Updated] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-08 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12367:
--
Affects Version/s: 1.2.1

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12367.001.patch
>
>






[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-08 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996169#comment-14996169
 ] 

Dapeng Sun commented on HIVE-12367:
---

[~leftylev], got it. Thank you for the information. :)

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch
>
>






[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744802#comment-14744802
 ] 

Dapeng Sun commented on HIVE-11780:
---

Thanks [~leftylev] for the reminder, and thanks [~Fred] for the doc.

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0, 1.2.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch, HIVE-11780.001.patch
>
>
> Hive should allow a user to disable all roles granted for the current 
> session via the statement {{SET ROLE NONE;}}
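As a usage sketch of the proposed behavior (the {{admin}} role here is illustrative):

```sql
SET ROLE admin;      -- enable one specific granted role for the session
SET ROLE NONE;       -- proposed: deactivate all roles for the session
SHOW CURRENT ROLES;  -- would then report no active roles
```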





[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-13 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Attachment: HIVE-11780.001.patch

Same patch, re-attached to trigger Jenkins.

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0, 1.2.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch, HIVE-11780.001.patch
>
>
> Hive should allow a user to disable all roles granted for the current 
> session via the statement {{SET ROLE NONE;}}





[jira] [Updated] (HIVE-11780) Hive support "set role none"

2015-09-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Attachment: HIVE-11780.001.patch

> Hive support "set role none"
> 
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>






[jira] [Updated] (HIVE-11780) "set role none" support

2015-09-10 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11780:
--
Summary: "set role none" support  (was: Hive support "set role none")

> "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-11780.001.patch
>
>





