[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503188#comment-16503188 ]

Saisai Shao commented on HIVE-16391:

Uploaded a new patch [^HIVE-16391.1.patch] to use the solution mentioned by Marcelo: it simply adds two new Maven modules and renames the original "hive-exec" module. One added module is the new "hive-exec", which stays compatible with the existing Hive artifact; the other added module, "hive-exec-spark", is built specifically for Spark.

> Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-16391
>                 URL: https://issues.apache.org/jira/browse/HIVE-16391
>             Project: Hive
>          Issue Type: Task
>          Components: Build Infrastructure
>    Affects Versions: 1.2.2
>            Reporter: Reynold Xin
>            Assignee: Saisai Shao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.2.3
>
>         Attachments: HIVE-16391.1.patch, HIVE-16391.patch
>
>
> Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the
> only change in the fork is to work around the issue that Hive publishes only
> two sets of jars: one set with no dependency declared, and another with all
> the dependencies included in the published uber jar. That is to say, Hive
> doesn't publish a set of jars with the proper dependencies declared.
> There is general consensus on both sides that we should remove the forked
> Hive.
> The change in the forked version is recorded here
> https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2
> Note that the fork in the past included other fixes but those have all become
> unnecessary.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
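The module split described in this comment can be sketched as a Maven layout. The artifactIds "hive-exec" and "hive-exec-spark" come from the comment itself; the renamed core module's artifactId and every plugin detail below are assumptions for illustration, not the contents of the attached patch.

```xml
<!-- Parent pom (sketch): the original ql module keeps its code but publishes
     under a new artifactId, and two thin shading modules are added. -->
<modules>
  <module>ql</module>               <!-- renamed: publishes a plain, non-shaded jar -->
  <module>hive-exec-shaded</module> <!-- new: republishes the classic uber "hive-exec" -->
  <module>hive-exec-spark</module>  <!-- new: Spark-oriented jar -->
</modules>

<!-- hive-exec-shaded/pom.xml (sketch): depends on the renamed core jar and
     re-creates the historical uber jar, so existing "hive-exec" consumers
     see no change in coordinates or contents. -->
<project>
  <artifactId>hive-exec</artifactId>
  <dependencies>
    <dependency>
      <groupId>org.apache.hive</groupId>
      <!-- "hive-exec-core" is an assumed name for the renamed ql module -->
      <artifactId>hive-exec-core</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</project>
```

The key property of this arrangement is that each of the three modules publishes its own POM, so the Spark-facing jar can finally declare its real transitive dependencies.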
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated HIVE-16391:
-------------------------------
    Attachment: HIVE-16391.1.patch
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511851#comment-16511851 ]

Saisai Shao commented on HIVE-16391:

I see. I can keep the "core" classifier and use another name; will update the patch. [~owen.omalley], could you please help review this patch, since you created the Spark JIRA, or point to someone in the Hive community who could review it? Thanks a lot.
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated HIVE-16391:
-------------------------------
    Attachment: HIVE-16391.2.patch
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505607#comment-16505607 ]

Saisai Shao commented on HIVE-16391:

Any comments, [~vanzin] [~ste...@apache.org]?
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501415#comment-16501415 ]

Saisai Shao commented on HIVE-16391:

I'm not sure whether submitting a PR is the right way to get a review in the Hive community; waiting for feedback.
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501285#comment-16501285 ]

Saisai Shao commented on HIVE-16391:

Hi [~joshrosen], I'm trying to make the Hive changes you mentioned above using the new classifier {{core-spark}}. I found one problem with releasing two shaded jars (one is hive-exec, the other is hive-exec-core-spark): the published POM file is still the dependency-reduced POM, which corresponds to hive-exec. So when Spark uses the hive-exec-core-spark jar, it has to explicitly declare all the transitive dependencies of hive-exec. I'm not sure whether there's a way to publish two POM files mapping to the two different shaded jars, or whether it is acceptable for Spark to explicitly declare all the transitive dependencies, as with the {{core}} classifier you used before.
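The reduced-POM limitation described here follows from how maven-shade-plugin attaches extra shaded jars: a module publishes exactly one POM, so a second shaded artifact attached under a classifier shares the dependency-reduced POM of the main artifact. A sketch of the relevant configuration (the classifier name is taken from the comment; everything else is illustrative):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <!-- attach the second shaded jar under a classifier instead of
             replacing the main artifact -->
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>core-spark</shadedClassifierName>
        <!-- only one dependency-reduced POM is generated per module, and it
             describes the main artifact, not the classified one -->
        <createDependencyReducedPom>true</createDependencyReducedPom>
      </configuration>
    </execution>
  </executions>
</plugin>
```

A consumer of the {{core-spark}} classifier therefore resolves the main artifact's POM, which is why Spark would have to declare the transitive dependencies itself.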
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501561#comment-16501561 ]

Saisai Shao commented on HIVE-16391:

It seems I don't have permission to upload a file.
[jira] [Comment Edited] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501561#comment-16501561 ]

Saisai Shao edited comment on HIVE-16391 at 6/5/18 10:15 AM:

It seems I don't have permission to upload a file; there's no such button.

was (Author: jerryshao): Seems there's no permission for me to upload a file.
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501667#comment-16501667 ]

Saisai Shao commented on HIVE-16391:

I see, thanks. Will upload the patch.
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated HIVE-16391:
-------------------------------
        Fix Version/s: 1.2.3
    Affects Version/s: 1.2.2
           Attachment: HIVE-16391.patch
               Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502756#comment-16502756 ]

Saisai Shao commented on HIVE-16391:

{quote}The problem with that is that it changes the meaning of Hive's artifacts, so anybody currently importing hive-exec would see a breakage, and that's probably not desired.{quote}

This might not be acceptable to the Hive community, because it would break current users, as you mentioned. As [~joshrosen] mentioned, Spark wants a hive-exec jar that shades kryo and protobuf-java, not a pure non-shaded jar.

{quote}Another option is to change the artifact name of the current "hive-exec" pom. Then you'd publish the normal jar under the new artifact name, then have a separate module that imports that jar, shades dependencies, and publishes the result as "hive-exec". That would maintain compatibility with existing artifacts.{quote}

I can try this approach, but it is not a small change for Hive, and I'm not sure whether the Hive community will accept it (at least for branch 1.2).
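Since Spark wants kryo and protobuf-java shaded but nothing else bundled, the Spark-facing module under the second option could relocate just those two packages. This is a sketch only: the artifact coordinates and relocation prefixes below are assumptions for illustration, and the packages Hive actually relocates may differ.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <!-- bundle only the libraries being relocated; everything else remains
           a normally declared dependency in the published POM -->
      <includes>
        <include>com.esotericsoftware.kryo:kryo</include>
        <include>com.google.protobuf:protobuf-java</include>
      </includes>
    </artifactSet>
    <relocations>
      <!-- move the shaded classes under a Hive-private prefix so they cannot
           clash with the versions Spark ships -->
      <relocation>
        <pattern>com.esotericsoftware</pattern>
        <shadedPattern>org.apache.hive.com.esotericsoftware</shadedPattern>
      </relocation>
      <relocation>
        <pattern>com.google.protobuf</pattern>
        <shadedPattern>org.apache.hive.com.google.protobuf</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

With only kryo and protobuf-java bundled, the dependency-reduced POM of this module would still declare all other transitive dependencies, which is exactly what Spark's fork was created to provide.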
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502976#comment-16502976 ]

Saisai Shao commented on HIVE-16391:

[~vanzin], one problem with your proposed solution: the hive-exec test jar is no longer valid, because we changed the artifact name of the current "hive-exec" pom. This might affect users who rely on that test jar.
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519931#comment-16519931 ]

Saisai Shao commented on HIVE-16391:

Gently ping [~hagleitn], could you please help review the currently proposed patch and suggest the next step? Thanks a lot.
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385539#comment-16385539 ]

Saisai Shao commented on HIVE-16391:

Hi all, do we have any progress on this? Spark currently uses the forked Hive 1.2.1.spark2, which blocks Hadoop 3.0 support (SPARK-18673). We could patch the forked Hive 1.2.1.spark2 to support Hadoop 3, but it seems the proper solution is to maintain this in Hive, as discussed in SPARK-20202, and fix it in the Hive community.