[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299651#comment-17299651 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky closed pull request #877:
URL: https://github.com/apache/parquet-mr/pull/877

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Cannot build from tarball because of git submodules
> ---
>
> Key: PARQUET-1992
> URL: https://issues.apache.org/jira/browse/PARQUET-1992
> Project: Parquet
> Issue Type: Bug
> Reporter: Gabor Szadovszky
> Priority: Blocker
>
> Because we use git submodules (to get test parquet files) a simple "mvn clean
> install" fails from the unpacked tarball due to "not a git repository".
> I think we would have 2 options to solve this situation:
> * Include all the required files (even only for testing) in the tarball and
> somehow avoid the git submodule update when executed in a non-git
> environment
> * Make the downloading of the parquet files and the related tests optional so
> it won't fail the build from the tarball

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299650#comment-17299650 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky commented on pull request #877:
URL: https://github.com/apache/parquet-mr/pull/877#issuecomment-796827049

Closing this one in favor of #878.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299648#comment-17299648 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky merged pull request #878:
URL: https://github.com/apache/parquet-mr/pull/878
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298694#comment-17298694 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky commented on a change in pull request #878:
URL: https://github.com/apache/parquet-mr/pull/878#discussion_r591264845

## File path: .gitignore ##
@@ -19,3 +19,4 @@ target/
 .cache
 *~
 mvn_install.log
+parquet-hadoop/parquet-testing

Review comment:
Since you are using `target` now this change is unnecessary.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297618#comment-17297618 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky commented on pull request #877:
URL: https://github.com/apache/parquet-mr/pull/877#issuecomment-792959931

Thanks a lot for taking the time to implement two separate solutions. I've reviewed the other PR already. Let me postpone/cancel the reviewing of this one until we all agree the other one is the preferred one.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297608#comment-17297608 ]

ASF GitHub Bot commented on PARQUET-1992:

gszadovszky commented on a change in pull request #878:
URL: https://github.com/apache/parquet-mr/pull/878#discussion_r589627173

## File path: parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestEncryptionOptions.java ##
@@ -299,13 +309,15 @@ public void testWriteReadEncryptedParquetFiles() throws IOException {
     testReadEncryptedParquetFiles(rootPath, DATA);
   }

-  @Test
-  public void testInteropReadEncryptedParquetFiles() throws IOException {
+  public void testInteropReadEncryptedParquetFiles(ErrorCollector errorCollector, OkHttpClient httpClient) throws IOException {

Review comment:
Please add a note that this method is deliberately not annotated with `@Test` and is used elsewhere.

## File path: parquet-hadoop/src/test/java/org/apache/parquet/hadoop/ITTestEncryptionOptions.java ##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.hadoop;
+
+import org.junit.Test;
+import org.junit.Rule;
+import org.junit.rules.ErrorCollector;
+
+import okhttp3.OkHttpClient;
+
+
+import java.io.IOException;
+
+/*
+ * This file continues the testing in TestEncryptionOptions. This test goals:
+ * 4) Perform interoperability tests with other (eg parquet-cpp) writers, by reading

Review comment:
The number `4` does not make much sense here.

## File path: .gitignore ##
@@ -19,3 +19,4 @@ target/
 .cache
 *~
 mvn_install.log
+parquet-hadoop/parquet-testing

Review comment:
I would suggest either using a temporary directory outside of the source tree or placing the files inside the `target` directory. `target` would have the benefit that it is not cleaned until a clean is explicitly invoked.
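The suggestion to keep downloaded test files under `target` could be sketched in Java along the following lines. This is only an illustration under stated assumptions, not code from the pull request: the class `InteropFileCache`, its method names, and the `parquet-testing` cache directory name are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: locates interop test files cached under a module's target directory. */
public class InteropFileCache {

    /** Builds <moduleDir>/target/<cacheDirName>/<fileName>; nothing is created on disk. */
    public static Path cachedPath(Path moduleDir, String cacheDirName, String fileName) {
        return moduleDir.resolve("target").resolve(cacheDirName).resolve(fileName);
    }

    /** Returns true when the file is not cached yet, so a test would have to fetch it first. */
    public static boolean needsDownload(Path cached) {
        return !Files.isRegularFile(cached);
    }
}
```

With this layout a cached file would survive repeated builds and disappear only when the `target` directory is cleaned, which is the benefit the review comment points out.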
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297218#comment-17297218 ]

Gabor Szadovszky commented on PARQUET-1992:

[~mayaa],

bq. Benefit - the regular dev flow of building and running unit tests won't require downloading files and connectivity to github

We already need to download a bunch of files from the internet (maven plugins and dependencies). So even the tarball does require downloading if we want to build/test.

bq. If so, they could be run by maven-failsafe-plugin as part of the integration-test/verify phase and missing the interop files would not fail "mvn install" but only "mvn verify"

AFAIK the failsafe plugin is configured to be executed at {{mvn verify}} and as {{install}} depends on the phase {{verify}} it still would fail if the integration tests could not be executed. BTW, we already have an integration test: [FileEncodingsIT|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/encodings/FileEncodingsIT.java].

bq. 2. Should the files for interop tests be downloaded directly in the test or using submodules in a separate maven profile for integration-test or as part of an existing profile, e.g. ci-test?

I think there is another option: downloading the required files directly from maven. I am not sure which plugin is capable of this or if it is better than downloading from the test by java code but it is still an option.

bq. Git submodules provide flows for handling downloaded file versions - specific to a commit or a branch.

A github download link can contain the hash of the changeset, so it is capable of handling file versions.

bq. Git submodules manage downloading files only when needed

This is not true in the current situation. We are invoking the {{git submodule update}} in the {{initialize}} phase of maven. So we are downloading the whole {{parquet-testing}} repo (of a specific changeset) at least once.

bq. It is aligned with the integration tests in parquet-cpp (arrow)

How does parquet-cpp solve the similar issue with the tarball?

bq. The files can be used for additional interop tests of other features

I agree, this was the first thing I liked about git submodules. Meanwhile, I've started thinking about implementing interoperability tests and now I think such tests could be implemented in the {{parquet-testing}} repo as they do not require low level access to the {{parquet-mr}} classes like unit tests do. My fear about the git submodules is that the {{parquet-testing}} repo might grow big and AFAIK you cannot control which files/directories you would like to sync, only the changeset.

bq. The tarball still won't contain the interop files, so the integration tests will fail on it.

I think we should not add the parquet files into the source tarball in any way.

bq. Anyway, both ways are acceptable, so I'll implement whatever sounds best to the community.

I currently agree with [~sha...@uber.com] about downloading the required files. Meanwhile I am curious about the parquet-cpp solution.

bq. BTW, when investigating the profiles, it seems to me that there is an old reference to the "travis" maven profile mentioned in the .travis.yml file, though its new name is "ci-test".

That's a good catch! We'll fix it.
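The point about {{install}} depending on {{verify}} follows from the ordering of Maven's default lifecycle: invoking a phase runs every phase before it. A minimal Java sketch of that ordering is shown below; the class `MavenLifecycle` and its method are hypothetical, and the phase list is an abbreviated subset of the real lifecycle, kept in its documented order.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of a subset of Maven's default lifecycle, in execution order. */
public class MavenLifecycle {

    // Abbreviated: the real default lifecycle has more phases between these.
    private static final List<String> PHASES = Arrays.asList(
        "validate", "initialize", "compile", "test", "package",
        "integration-test", "verify", "install", "deploy");

    /** Returns the phases executed when the given phase is invoked, ending with that phase. */
    public static List<String> phasesRunBy(String phase) {
        int idx = PHASES.indexOf(phase);
        if (idx < 0) {
            throw new IllegalArgumentException("Unknown phase: " + phase);
        }
        return PHASES.subList(0, idx + 1);
    }
}
```

Under this model `phasesRunBy("install")` includes `integration-test` and `verify`, while `phasesRunBy("package")` stops before them, so tests bound to the integration-test phase would be skipped by `mvn package` but not by `mvn install`.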
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296124#comment-17296124 ]

Xinli Shang commented on PARQUET-1992:

I think we shouldn't let it fail when developers run 'mvn package/install' or 'mvn verify' in any case if they don't make any changes. So I like the idea of downloading directly. I will review the code once it passes the build.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296064#comment-17296064 ]

ASF GitHub Bot commented on PARQUET-1992:

andersonm-ibm opened a new pull request #878:
URL: https://github.com/apache/parquet-mr/pull/878

This PR is another option for solving the problem "Cannot build from tarball because of git submodules".

1. Encryption interop tests run separately from unit tests, in the integration-test phase
2. Files for interop tests are downloaded manually, so git submodules are totally removed from the project

- `mvn package/install` - doesn't run interop tests
- `mvn verify` - runs interop tests and downloads the files for interop from GitHub directly
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294676#comment-17294676 ]

Maya Anderson commented on PARQUET-1992:

[~gszadovszky], I would say that there are 2 questions here:

1. Should the interop tests run separately from unit tests?
* Benefit - the regular dev flow of building and running unit tests won't require downloading files and connectivity to github
* If so, they could be run by {{maven-failsafe-plugin}} as part of the integration-test/verify phase and missing the interop files would not fail "{{mvn install}}" but only "{{mvn verify}}"

2. Should the files for interop tests be downloaded directly in the test or using submodules in a separate maven profile for integration-test or as part of an existing profile, e.g. {{ci-test}}?

I see the following advantages of the submodule approach:
* Git submodules provide flows for handling downloaded file versions - specific to a commit or a branch.
* Git submodules manage downloading files only when needed
* It is aligned with the integration tests in parquet-cpp (arrow)
* The files can be used for additional interop tests of other features

Disadvantages:
* The tarball still won't contain the interop files, so the integration tests will fail on it. However, if interop tests are separate from unit tests, then maybe it shouldn't be a problem?

Anyway, both ways are acceptable, so I'll implement whatever sounds best to the community.

BTW, when investigating the profiles, it seems to me that there is an old reference to the "travis" maven profile mentioned in the .travis.yml file, though its new name is "ci-test".
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293788#comment-17293788 ]

Gabor Szadovszky commented on PARQUET-1992:

[~mayaa], the {{integration-test}} you've referenced is a goal of the {{maven-failsafe-plugin}}, not a profile reference. Anyway, one option is not to execute these tests (and the related git submodule update) for the default profiles.

Another way would be to download the required parquet files in a way that works even if you are not in a git repository (the extracted tarball). One easy way is to use the direct github links of the actual files (e.g. https://github.com/apache/parquet-testing/raw/40379b3/data/encrypt_columns_and_footer.parquet.encrypted).

I think downloading the files has some benefits over updating the whole github submodule. But I am open to discussions.
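The direct-link idea in this comment could be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the code of either pull request: the class `PinnedTestFile` and its methods are hypothetical, and it uses only JDK classes rather than the OkHttpClient that appears in the reviewed PR. The repository path and the short hash `40379b3` are taken from the example link above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: resolves parquet-testing files through links pinned to one changeset. */
public class PinnedTestFile {

    // Base of the raw-download links, following the example link in the comment.
    private static final String BASE = "https://github.com/apache/parquet-testing/raw/";

    /** Returns the direct link for a file under data/ at the given changeset hash. */
    public static String rawUrl(String changeset, String fileName) {
        return BASE + changeset + "/data/" + fileName;
    }

    /** Downloads the pinned file into targetDir (created if missing) and returns the local path. */
    public static Path download(String changeset, String fileName, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path local = targetDir.resolve(fileName);
        try (InputStream in = URI.create(rawUrl(changeset, fileName)).toURL().openStream()) {
            Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
        }
        return local;
    }
}
```

Because the changeset hash is part of the link, the tests keep reading the same file contents even if the parquet-testing repository changes later, and no git checkout is needed, so the same code path would work from an unpacked tarball.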
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293680#comment-17293680 ]

Maya Anderson commented on PARQUET-1992:

[~gszadovszky] and [~junjie], how about moving the integration tests into a separate integration-test profile, somewhat similar to the one described in [https://www.petrikainulainen.net/programming/maven/integration-testing-with-maven/]?

I see that we already have a reference to this currently non-existing profile: https://github.com/apache/parquet-mr/blame/2c6ceb330bbfd282715730b478714b84418c0749/pom.xml#L411
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293536#comment-17293536 ]

Gabor Szadovszky commented on PARQUET-1992:

Some more info about the tar file generation. It is generated by the script [dev/source-release.sh|https://github.com/apache/parquet-mr/blob/master/dev/source-release.sh#L57]. The command {{git archive}} is used. It seems that {{git archive}} does not care about git submodules.

However, it is not necessarily a bad thing. Currently the whole repository of parquet-testing is cloned. This is not a big deal because currently it is 136K only. But we are planning to extend that repo, and we can never know when someone will upload files for testing something that is unrelated to parquet-mr. Also, the content of parquet-testing is not something we would like to include in our source tarball.

In summary, we need a method for downloading the required parquet files that works both from the git repo (at development or from the CI) and from the unpacked source tarball.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293509#comment-17293509 ]

Gidon Gershinsky commented on PARQUET-1992:

This contribution was added by [~mayaa]; she knows the subject better than I do. Maya, could you address the comments and the question in this jira?
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293475#comment-17293475 ]

Gabor Szadovszky commented on PARQUET-1992:

That's a good idea, [~junjie]. Another option would be to download the required parquet files using github direct links. We are downloading our dependencies during the build anyway so downloading some files additionally required by the testing should be acceptable.
[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293285#comment-17293285 ]

Junjie Chen commented on PARQUET-1992:

How about making the related tests required in Travis CI and optional in the maven build?