[jira] [Work logged] (GRIFFIN-352) Running measure fails with NoSuchMethodError
[ https://issues.apache.org/jira/browse/GRIFFIN-352?focusedWorklogId=791703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791703 ]

ASF GitHub Bot logged work on GRIFFIN-352:
------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Jul/22 00:07
Start Date: 17/Jul/22 00:07
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on PR #601:
URL: https://github.com/apache/griffin/pull/601#issuecomment-1186345010

Automated Message: We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the 'no-pr-activity' tag!

Issue Time Tracking
-------------------

Worklog Id: (was: 791703)
Time Spent: 0.5h (was: 20m)

> Running measure fails with NoSuchMethodError
> --------------------------------------------
>
> Key: GRIFFIN-352
> URL: https://issues.apache.org/jira/browse/GRIFFIN-352
> Project: Griffin
> Issue Type: Bug
> Components: Measure Module
> Affects Versions: 0.6.0
> Environment: spark 2.4, Hive 3.1
> Reporter: Vijay Kiran
> Assignee: William Guo
> Priority: Major
> Attachments: env_batch.json, env_streaming.json
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> With 0.6.0 and 0.7.0-SNAPSHOT, running the measure module with the console sink fails with a NoSuchMethodError (NSME); please see the log below.
> {code}
> 20/11/17 17:21:23 INFO transform.SparkSqlTransformStep: main begin transform step :
> accu
> | |---__missCount
> | | |---__missRecords
> | |---__totalCount
> Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.sameThreadExecutor()Lcom/google/common/util/concurrent/ListeningExecutorService;
>     at org.apache.griffin.measure.utils.ThreadUtils$.<init>(ThreadUtils.scala:35)
>     at org.apache.griffin.measure.utils.ThreadUtils$.<clinit>(ThreadUtils.scala)
>     at org.apache.griffin.measure.step.transform.TransformStep$.<init>(TransformStep.scala:118)
>     at org.apache.griffin.measure.step.transform.TransformStep$.<clinit>(TransformStep.scala)
>     at org.apache.griffin.measure.step.transform.TransformStep$$anonfun$3.apply(TransformStep.scala:61)
>     at org.apache.griffin.measure.step.transform.TransformStep$$anonfun$3.apply(TransformStep.scala:51)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>     at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>     at scala.collection.mutable.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:46)
>     at scala.collection.SetLike$class.map(SetLike.scala:92)
>     at scala.collection.mutable.AbstractSet.map(Set.scala:46)
>     at org.apache.griffin.measure.step.transform.TransformStep$class.execute(TransformStep.scala:51)
>     at org.apache.griffin.measure.step.transform.SparkSqlTransformStep.execute(SparkSqlTransformStep.scala:28)
>     at org.apache.griffin.measure.job.DQJob$$anonfun$execute$2.apply(DQJob.scala:29)
>     at org.apache.griffin.measure.job.DQJob$$anonfun$execute$2.apply(DQJob.scala:29)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>     at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>     at scala.collection.immutable.List.map(List.scala:296)
>     at org.apache.griffin.measure.job.DQJob.execute(DQJob.scala:29)
>     at org.apache.griffin.measure.launch.batch.BatchDQApp$$anonfun$1.apply(BatchDQApp.scala:85)
>     at org.apache.griffin.measure.launch.batch.BatchDQApp$$anonfun$1.apply(BatchDQApp.scala:64)
>     at org.apache.griffin.measure.utils.CommonUtils$.timeThis(CommonUtils.scala:36)
>     at org.apache.griffin.measure.launch.batch.BatchDQApp.run(BatchDQApp.scala:64)
>     at org.apache.griffin.measure.Application$.main(Application.scala:92)
>     at org.apache.griffin.measure.Application.main(Application.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847)
[jira] [Work logged] (GRIFFIN-352) Running measure fails with NoSuchMethodError
[ https://issues.apache.org/jira/browse/GRIFFIN-352?focusedWorklogId=791704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791704 ]

ASF GitHub Bot logged work on GRIFFIN-352:
------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Jul/22 00:07
Start Date: 17/Jul/22 00:07
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #601: [GRIFFIN-352] Resolve conflicts between Guava versions
URL: https://github.com/apache/griffin/pull/601

Issue Time Tracking
-------------------

Worklog Id: (was: 791704)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (GRIFFIN-352) Running measure fails with NoSuchMethodError
[ https://issues.apache.org/jira/browse/GRIFFIN-352?focusedWorklogId=787261&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-787261 ]

ASF GitHub Bot logged work on GRIFFIN-352:
------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Jul/22 00:04
Start Date: 02/Jul/22 00:04
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on PR #601:
URL: https://github.com/apache/griffin/pull/601#issuecomment-1172791039

Automated Message: This PR is being labelled as stale and will be closed in next 15 days due to lack of activity. To avoid this push new commits or ask the committers for a review/ resolution.

Issue Time Tracking
-------------------

Worklog Id: (was: 787261)
Time Spent: 20m (was: 10m)
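The `NoSuchMethodError` reported in GRIFFIN-352 means that the `MoreExecutors` class loaded at runtime does not declare `sameThreadExecutor()`. Guava deprecated that method in 18.0 and removed it in 21.0, so the error is what one would expect when a newer Guava jar takes precedence on the classpath. The sketch below is one way to check this from the same classpath as the failing job; it is not taken from the issue or from PR #601, and the class name `ClasspathProbe` and method `describe` are hypothetical. It only reports which jar a class was loaded from and whether a public no-argument method with the given name exists.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Griffin or Guava.
public class ClasspathProbe {

    /** Reports where a class was loaded from and whether it has a public no-arg method. */
    public static String describe(String className, String methodName) {
        try {
            // Load without running static initializers; we only inspect the class.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK classes have no code source; application classes report their jar URL.
            String origin = (src == null) ? "JDK" : String.valueOf(src.getLocation());
            boolean present = false;
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName) && m.getParameterCount() == 0) {
                    present = true;
                    break;
                }
            }
            return className + " from " + origin + ": " + methodName + "() "
                    + (present ? "present" : "MISSING");
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not on classpath";
        }
    }

    public static void main(String[] args) {
        String guava = "com.google.common.util.concurrent.MoreExecutors";
        // Removed in Guava 21.0; the method named in the stack trace.
        System.out.println(describe(guava, "sameThreadExecutor"));
        // Available since Guava 18.0; also returns a ListeningExecutorService.
        System.out.println(describe(guava, "newDirectExecutorService"));
    }
}
```

If the first line reports `MISSING` and the second `present`, the runtime Guava is likely 21.0 or later. Typical remedies are to relocate (shade) Guava inside the application jar or to switch the call site to `newDirectExecutorService()`; which of these PR #601 attempted is not stated in these notifications.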
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=749023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-749023 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

Author: ASF GitHub Bot
Created on: 29/Mar/22 00:03
Start Date: 29/Mar/22 00:03
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #599:
URL: https://github.com/apache/griffin/pull/599#issuecomment-1081265665

Automated Message: We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the 'no-pr-activity' tag!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 749023)
Remaining Estimate: 19h 50m (was: 20h)
Time Spent: 4h 10m (was: 4h)

> Hive kerberos for Griffin error
> -------------------------------
>
> Key: GRIFFIN-363
> URL: https://issues.apache.org/jira/browse/GRIFFIN-363
> Project: Griffin
> Issue Type: Bug
> Components: Service Module
> Affects Versions: 0.6.0
> Environment: CentOS 7
> Reporter: MenghuiWan
> Priority: Minor
> Labels: kerberos
> Original Estimate: 24h
> Time Spent: 4h 10m
> Remaining Estimate: 19h 50m
>
> I am sorry that my English is not good.
>
> This is my problem:
> I am trying to use Griffin for data quality projects.
> Our environment is a CDH 6.1.1 cluster with Kerberos.
> I want to connect to Hive, so I set up all the Kerberos configuration according to the user guide,
> but the Hive connection failed.
>
> The error log is:
> {quote}
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore [518] : Waiting 1 seconds before next connection attempt.
> 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore [434] : Trying to connect to metastore with URI thrift://..com:9083
> 2021-06-15 13:33:02.422 ERROR 102938 --- [ main] o.a.t.t.TSaslTransport [315] : SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_131]
>     at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:?]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) ~[hive-metastore-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:286) ~[hive-metastore-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:211) ~[hive-metastore-2.2.0.jar:2.2.0]
>     at org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
>     at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=749022&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-749022 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

Author: ASF GitHub Bot
Created on: 29/Mar/22 00:03
Start Date: 29/Mar/22 00:03
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #599:
URL: https://github.com/apache/griffin/pull/599

Issue Time Tracking
-------------------

Worklog Id: (was: 749022)
Remaining Estimate: 20h (was: 20h 10m)
Time Spent: 4h (was: 3h 50m)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=740486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-740486 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

Author: ASF GitHub Bot
Created on: 13/Mar/22 00:04
Start Date: 13/Mar/22 00:04
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #599:
URL: https://github.com/apache/griffin/pull/599#issuecomment-1065988107

Automated Message: This PR is being labelled as stale and will be closed in next 15 days due to lack of activity. To avoid this push new commits or ask the committers for a review/ resolution.

Issue Time Tracking
-------------------

Worklog Id: (was: 740486)
Remaining Estimate: 20h 10m (was: 20h 20m)
Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=724205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-724205 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

Author: ASF GitHub Bot
Created on: 10/Feb/22 00:59
Start Date: 10/Feb/22 00:59
Worklog Time Spent: 10m
Work Description: wankunde commented on pull request #599:
URL: https://github.com/apache/griffin/pull/599#issuecomment-1034371536

LGTM. @chitralverma WDYT?

Issue Time Tracking
-------------------

Worklog Id: (was: 724205)
Remaining Estimate: 20h 20m (was: 20.5h)
Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=713554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713554 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 24/Jan/22 07:59 Start Date: 24/Jan/22 07:59 Worklog Time Spent: 10m Work Description: lipzhu opened a new pull request #599: URL: https://github.com/apache/griffin/pull/599 **What changes were proposed in this pull request?** Support the Hive Metastore client authentication method via **kerberos** **Does this PR introduce any user-facing change?** No. **How was this patch tested?** Unit Tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 713554) Remaining Estimate: 20.5h (was: 20h 40m) Time Spent: 3.5h (was: 3h 20m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 3.5h > Remaining Estimate: 20.5h > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed. 
>
> error log is:
> {quote}
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore [518] : Waiting 1 seconds before next connection attempt.
> 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore [434] : Trying to connect to metastore with URI thrift://..com:9083
> 2021-06-15 13:33:02.422 ERROR 102938 --- [ main] o.a.t.t.TSaslTransport [315] : SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
> at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_131]
> at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) ~[libthrift-0.9.3.jar:0.9.3]
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[libthrift-0.9.3.jar:0.9.3]
> at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
> at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) ~[hive-shims-common-2.2.0.jar:2.2.0]
> at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:?]
> at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) ~[hive-metastore-2.2.0.jar:2.2.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:286) ~[hive-metastore-2.2.0.jar:2.2.0]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:211) ~[hive-metastore-2.2.0.jar:2.2.0]
> at org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
[jira] [Work logged] (GRIFFIN-362) Oracle connection for Apache Griffin
[ https://issues.apache.org/jira/browse/GRIFFIN-362?focusedWorklogId=713495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713495 ]

ASF GitHub Bot logged work on GRIFFIN-362:
Author: ASF GitHub Bot
Created on: 24/Jan/22 05:21
Start Date: 24/Jan/22 05:21
Worklog Time Spent: 10m
Work Description: asfgit closed pull request #597:
URL: https://github.com/apache/griffin/pull/597

Issue Time Tracking
Worklog Id: (was: 713495)
Time Spent: 50m (was: 40m)

> Oracle connection for Apache Griffin
>
> Key: GRIFFIN-362
> URL: https://issues.apache.org/jira/browse/GRIFFIN-362
> Project: Griffin
> Issue Type: Bug
> Components: accuracy-batch
> Affects Versions: 0.6.0
> Environment: Dev
> Reporter: Praveen Kurup
> Priority: Blocker
> Attachments: image-2021-04-27-23-07-33-681.png
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Hello Team,
> We are doing a POC using Apache Griffin for data quality projects.
> We have a requirement to check data quality between Oracle and Hive tables.
> The Hive connection is working fine, but we are not able to establish an Oracle connection.
> I would like to understand whether Apache Griffin supports a JDBC Oracle connection
> using the oracle.jdbc.driver.OracleDriver driver. I tried using the MySQL JDBC
> connection template to pass the Oracle connection details, but it did not work.
> I am getting the error below:
> ERROR griffin: JDBC driver oracle.jdbc.driver.OracleDriver provided is not found in class path
> java.lang.ClassNotFoundException: oracle.jdbc.driver.OracleDriver
> Please let me know if there is any way to establish Oracle database connectivity from Griffin.
> Also, please share any documentation available to achieve this.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
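Note on the error above: a ClassNotFoundException for oracle.jdbc.driver.OracleDriver means the JVM running the job could not load that class, which usually points at the Oracle JDBC jar not being on the classpath. The thread does not say how the job was launched, so the following is only a minimal sketch for checking this up front. `DriverCheck` and `isOnClasspath` are hypothetical names used for illustration and are not part of Griffin.

```scala
import scala.util.Try

object DriverCheck {
  // Returns true when the named class can be loaded from the current classpath.
  // Class.forName throws ClassNotFoundException when the class is absent,
  // which Try turns into a Failure.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess
}
```

Calling `DriverCheck.isOnClasspath("oracle.jdbc.driver.OracleDriver")` before building the connector would report the missing driver early. One common way to supply a driver jar to a Spark job is the `--jars` option of spark-submit; whether that matches the reporter's setup is an assumption.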
[jira] [Work logged] (GRIFFIN-362) Oracle connection for Apache Griffin
[ https://issues.apache.org/jira/browse/GRIFFIN-362?focusedWorklogId=713494=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713494 ]

ASF GitHub Bot logged work on GRIFFIN-362:
Author: ASF GitHub Bot
Created on: 24/Jan/22 05:19
Start Date: 24/Jan/22 05:19
Worklog Time Spent: 10m
Work Description: chitralverma commented on pull request #597:
URL: https://github.com/apache/griffin/pull/597#issuecomment-1019731142

LGTM, merging.

Issue Time Tracking
Worklog Id: (was: 713494)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=713493=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713493 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 24/Jan/22 05:10
Start Date: 24/Jan/22 05:10
Worklog Time Spent: 10m
Work Description: chitralverma closed pull request #598:
URL: https://github.com/apache/griffin/pull/598

Issue Time Tracking
Worklog Id: (was: 713493)
Time Spent: 1h (was: 50m)

> Bug fix for avro format in data connector
>
> Key: GRIFFIN-369
> URL: https://issues.apache.org/jira/browse/GRIFFIN-369
> Project: Griffin
> Issue Type: Bug
> Reporter: Zhu, Lipeng
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> built-in AVRO data source implementation is released in spark 2.4.0.
> In spark 2.3.x, we need to use com.databricks.spark.avro.
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=713492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713492 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 24/Jan/22 05:10
Start Date: 24/Jan/22 05:10
Worklog Time Spent: 10m
Work Description: lipzhu opened a new pull request #598:
URL: https://github.com/apache/griffin/pull/598

**What changes were proposed in this pull request?**
The built-in Avro format was released in Spark 2.4.0 (https://issues.apache.org/jira/browse/SPARK-24768).
For Griffin, we still need to map the Avro format to com.databricks.spark.avro in a Spark 2.3.x environment.

**Does this PR introduce any user-facing change?**
No.

**How was this patch tested?**
Unit Tests

Issue Time Tracking
Worklog Id: (was: 713492)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=713491=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713491 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 24/Jan/22 05:06
Start Date: 24/Jan/22 05:06
Worklog Time Spent: 10m
Work Description: asfgit closed pull request #598:
URL: https://github.com/apache/griffin/pull/598

Issue Time Tracking
Worklog Id: (was: 713491)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=713489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713489 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 24/Jan/22 04:50
Start Date: 24/Jan/22 04:50
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #598:
URL: https://github.com/apache/griffin/pull/598#discussion_r790412145

## File path: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/batch/FileBasedDataConnector.scala
@@ -79,7 +79,8 @@ case class FileBasedDataConnector(
     SupportedFormats.contains(format),
     s"Invalid format '$format' specified. Must be one of ${SupportedFormats.mkString("['", "', '", "']")}")

-  if (format.equalsIgnoreCase("avro") && sparkSession.version < "2.3.0") {
+  // Use old implementation for AVRO format if current spark version is not 2.4.x and above
+  if (format.equalsIgnoreCase("avro") && sparkSession.version < "2.4.0") {

Review comment:
```suggestion
  if ("avro".equalsIgnoreCase(format) && sparkSession.version < "2.4.0") {
```

Issue Time Tracking
Worklog Id: (was: 713489)
Time Spent: 0.5h (was: 20m)
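Side note on the check under review: `sparkSession.version < "2.4.0"` compares strings lexicographically, so a version such as "2.10.0" would sort before "2.4.0". No such Spark release exists, so the check in the PR behaves as intended for real versions. The sketch below only illustrates a numeric comparison as an alternative. `SparkVersion`, `majorMinor`, `atLeast` and `avroFormat` are hypothetical names, not Griffin code, and the mapping to "avro" versus com.databricks.spark.avro follows the PR description.

```scala
object SparkVersion {
  // Parses the leading "major.minor" of a version string such as "2.4.8" or "3.0.0-SNAPSHOT".
  def majorMinor(version: String): (Int, Int) = {
    val parts = version.split("\\.")
    (parts(0).toInt, parts(1).takeWhile(_.isDigit).toInt)
  }

  // True when `version` is at least `major.minor`, compared numerically.
  def atLeast(version: String, major: Int, minor: Int): Boolean = {
    val (maj, min) = majorMinor(version)
    maj > major || (maj == major && min >= minor)
  }

  // Picks the Avro data source name: the built-in "avro" from Spark 2.4 onwards,
  // the Databricks package for older versions.
  def avroFormat(version: String): String =
    if (atLeast(version, 2, 4)) "avro" else "com.databricks.spark.avro"
}
```

With this helper the connector could call `SparkVersion.avroFormat(sparkSession.version)` instead of comparing version strings directly; that wiring is an assumption, since the body of the `if` branch is not shown in the review.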
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=713488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713488 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 24/Jan/22 04:48
Start Date: 24/Jan/22 04:48
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #598:
URL: https://github.com/apache/griffin/pull/598#discussion_r790411689

## File path: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/batch/FileBasedDataConnector.scala
@@ -79,7 +79,8 @@ case class FileBasedDataConnector(
     SupportedFormats.contains(format),
     s"Invalid format '$format' specified. Must be one of ${SupportedFormats.mkString("['", "', '", "']")}")

-  if (format.equalsIgnoreCase("avro") && sparkSession.version < "2.3.0") {
+  // built-in AVRO data source implementation is released in spark 2.4.0

Review comment:
Suggestion:
```suggestion
  // Use old implementation for AVRO format if current spark version is not 2.4.x and above
```

Issue Time Tracking
Worklog Id: (was: 713488)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=711893=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-711893 ]

ASF GitHub Bot logged work on GRIFFIN-367:
Author: ASF GitHub Bot
Created on: 20/Jan/22 06:53
Start Date: 20/Jan/22 06:53
Worklog Time Spent: 10m
Work Description: whhe merged pull request #596:
URL: https://github.com/apache/griffin/pull/596

Issue Time Tracking
Worklog Id: (was: 711893)
Time Spent: 1h (was: 50m)

> Deployment guide doc update
>
> Key: GRIFFIN-367
> URL: https://issues.apache.org/jira/browse/GRIFFIN-367
> Project: Griffin
> Issue Type: Improvement
> Reporter: Zhu, Lipeng
> Priority: Minor
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Update deployment guide.
> # Move the service-${version}.tar.gz from parent target folder to service module target.
> # Remove the diff change history for hive-site.xml.
[jira] [Work logged] (GRIFFIN-369) Bug fix for avro format in data connector
[ https://issues.apache.org/jira/browse/GRIFFIN-369?focusedWorklogId=710482=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-710482 ]

ASF GitHub Bot logged work on GRIFFIN-369:
Author: ASF GitHub Bot
Created on: 18/Jan/22 12:58
Start Date: 18/Jan/22 12:58
Worklog Time Spent: 10m
Work Description: lipzhu opened a new pull request #598:
URL: https://github.com/apache/griffin/pull/598

**What changes were proposed in this pull request?**
The built-in Avro format was released in Spark 2.4.0 (https://issues.apache.org/jira/browse/SPARK-24768).
For Griffin, we still need to map the Avro format to com.databricks.spark.avro in a Spark 2.3.x environment.

**Does this PR introduce any user-facing change?**
No.

**How was this patch tested?**
Unit Tests

Issue Time Tracking
Worklog Id: (was: 710482)
Remaining Estimate: 0h
Time Spent: 10m
[jira] [Work logged] (GRIFFIN-362) Oracle connection for Apache Griffin
[ https://issues.apache.org/jira/browse/GRIFFIN-362?focusedWorklogId=710228=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-710228 ]

ASF GitHub Bot logged work on GRIFFIN-362:
Author: ASF GitHub Bot
Created on: 18/Jan/22 03:35
Start Date: 18/Jan/22 03:35
Worklog Time Spent: 10m
Work Description: lipzhu commented on a change in pull request #597:
URL: https://github.com/apache/griffin/pull/597#discussion_r786393569

## File path: measure/pom.xml
@@ -53,6 +53,8 @@ under the License.
 2.3.0
 2.1.0
 2.10.0
+9.4.1212.jre7

Review comment:
> @lipzhu is there a valid JIRA ticket for this feature? If not then please create one, link it to this PR and also add a valid description to the PR.
>
> Most of the PRs follow this process. Ref: #593

@chitralverma Thanks for your review. This PR tries to resolve https://issues.apache.org/jira/browse/GRIFFIN-362.

Issue Time Tracking
Worklog Id: (was: 710228)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=709952=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709952 ]

ASF GitHub Bot logged work on GRIFFIN-367:
Author: ASF GitHub Bot
Created on: 17/Jan/22 14:11
Start Date: 17/Jan/22 14:11
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #596:
URL: https://github.com/apache/griffin/pull/596#discussion_r786042592

## File path: service/pom.xml
@@ -385,7 +385,6 @@ under the License.
 false
 false
-../target

Review comment:
@lipzhu please update the docs instead

Issue Time Tracking
Worklog Id: (was: 709952)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (GRIFFIN-362) Oracle connection for Apache Griffin
[ https://issues.apache.org/jira/browse/GRIFFIN-362?focusedWorklogId=709951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709951 ]

ASF GitHub Bot logged work on GRIFFIN-362:
Author: ASF GitHub Bot
Created on: 17/Jan/22 14:07
Start Date: 17/Jan/22 14:07
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #597:
URL: https://github.com/apache/griffin/pull/597#discussion_r786035047

## File path: measure/pom.xml
@@ -53,6 +53,8 @@ under the License.
 2.3.0
 2.1.0
 2.10.0
+9.4.1212.jre7

Review comment:
This version is compiled with java7 and is from 2016. Please use a more recent and stable version.

Issue Time Tracking
Worklog Id: (was: 709951)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (GRIFFIN-362) Oracle connection for Apache Griffin
[ https://issues.apache.org/jira/browse/GRIFFIN-362?focusedWorklogId=709948=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709948 ]

ASF GitHub Bot logged work on GRIFFIN-362:
Author: ASF GitHub Bot
Created on: 17/Jan/22 14:03
Start Date: 17/Jan/22 14:03
Worklog Time Spent: 10m
Work Description: chitralverma commented on pull request #597:
URL: https://github.com/apache/griffin/pull/597#issuecomment-1014578475

@lipzhu is there a valid JIRA ticket for this feature? If not then please create one, link it to this PR and also add a valid description to the PR.

Most of the PRs follow this process. Ref: https://github.com/apache/griffin/pull/593

Issue Time Tracking
Worklog Id: (was: 709948)
Remaining Estimate: 0h
Time Spent: 10m
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=709887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709887 ]

ASF GitHub Bot logged work on GRIFFIN-367:
Author: ASF GitHub Bot
Created on: 17/Jan/22 12:00
Start Date: 17/Jan/22 12:00
Worklog Time Spent: 10m
Work Description: whhe commented on a change in pull request #596:
URL: https://github.com/apache/griffin/pull/596#discussion_r785921305

## File path: service/pom.xml
@@ -385,7 +385,6 @@ under the License.
 false
 false
-../target

Review comment:
> The service-${version}.tar.gz location is not aligned with the documentation in https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md. Alternatively, we can update the documentation to remove the confusion.
>
> > It's easy to build Griffin, just run maven command mvn clean install. Successfully building, you can get service-${version}.tar.gz and measure-${version}.jar from target folder in **service** and **measure** module.

You're right. I think it's better to update the doc, and the following commands should also be modified at the same time.

```bash
cd $GRIFFIN_INSTALL_DIR
tar -zxvf service-${version}.tar.gz
```

Issue Time Tracking
Worklog Id: (was: 709887)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=709852=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709852 ]

ASF GitHub Bot logged work on GRIFFIN-367:
Author: ASF GitHub Bot
Created on: 17/Jan/22 11:09
Start Date: 17/Jan/22 11:09
Worklog Time Spent: 10m
Work Description: lipzhu commented on a change in pull request #596:
URL: https://github.com/apache/griffin/pull/596#discussion_r785865424

## File path: service/pom.xml
@@ -385,7 +385,6 @@ under the License.
 false
 false
-../target

Review comment:
The service-${version}.tar.gz location is not aligned with the documentation in https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md. Alternatively, we can update the documentation to remove the confusion.

> It's easy to build Griffin, just run maven command mvn clean install. Successfully building, you can get service-${version}.tar.gz and measure-${version}.jar from target folder in **service** and **measure** module.

Issue Time Tracking
Worklog Id: (was: 709852)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=709847=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709847 ] ASF GitHub Bot logged work on GRIFFIN-367: -- Author: ASF GitHub Bot Created on: 17/Jan/22 10:57 Start Date: 17/Jan/22 10:57 Worklog Time Spent: 10m Work Description: whhe commented on a change in pull request #596: URL: https://github.com/apache/griffin/pull/596#discussion_r785853228 ## File path: service/pom.xml ## @@ -385,7 +385,6 @@ under the License. false false -../target Review comment: Why move it? I think it's appropriate to put the deployment package in the root directory. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 709847) Time Spent: 20m (was: 10m) > Deployment guide doc update > --- > > Key: GRIFFIN-367 > URL: https://issues.apache.org/jira/browse/GRIFFIN-367 > Project: Griffin > Issue Type: Improvement >Reporter: Zhu, Lipeng >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Update deployment guide. > # Move the service-${version}.tar.gz from parent target folder to service > module target. > # Remove the diff change history for hive-site.xml. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Work logged] (GRIFFIN-367) Deployment guide doc update
[ https://issues.apache.org/jira/browse/GRIFFIN-367?focusedWorklogId=709739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-709739 ] ASF GitHub Bot logged work on GRIFFIN-367: -- Author: ASF GitHub Bot Created on: 17/Jan/22 07:27 Start Date: 17/Jan/22 07:27 Worklog Time Spent: 10m Work Description: lipzhu opened a new pull request #596: URL: https://github.com/apache/griffin/pull/596 1. Move the service-${version}.tar.gz from parent target folder to service module target. 2. Remove the diff change history for hive-site.xml. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 709739) Remaining Estimate: 0h Time Spent: 10m > Deployment guide doc update > --- > > Key: GRIFFIN-367 > URL: https://issues.apache.org/jira/browse/GRIFFIN-367 > Project: Griffin > Issue Type: Improvement >Reporter: Zhu, Lipeng >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Update deployment guide. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Work logged] (GRIFFIN-365) Measure Enhancements and Stability fixes
[ https://issues.apache.org/jira/browse/GRIFFIN-365?focusedWorklogId=659633=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-659633 ] ASF GitHub Bot logged work on GRIFFIN-365: -- Author: ASF GitHub Bot Created on: 04/Oct/21 15:12 Start Date: 04/Oct/21 15:12 Worklog Time Spent: 10m Work Description: whhe merged pull request #593: URL: https://github.com/apache/griffin/pull/593 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 659633) Time Spent: 40m (was: 0.5h) > Measure Enhancements and Stability fixes > > > Key: GRIFFIN-365 > URL: https://issues.apache.org/jira/browse/GRIFFIN-365 > Project: Griffin > Issue Type: Improvement > Components: Measure Module >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Critical > Time Spent: 40m > Remaining Estimate: 0h > > General updates and fixes to the new measures added as part of > [GRIFFIN-358|https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-358] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-365) Measure Enhancements and Stability fixes
[ https://issues.apache.org/jira/browse/GRIFFIN-365?focusedWorklogId=658112=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-658112 ] ASF GitHub Bot logged work on GRIFFIN-365: -- Author: ASF GitHub Bot Created on: 30/Sep/21 06:37 Start Date: 30/Sep/21 06:37 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #593: URL: https://github.com/apache/griffin/pull/593#issuecomment-930854008 LGTM. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 658112) Time Spent: 0.5h (was: 20m) > Measure Enhancements and Stability fixes > > > Key: GRIFFIN-365 > URL: https://issues.apache.org/jira/browse/GRIFFIN-365 > Project: Griffin > Issue Type: Improvement > Components: Measure Module >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Critical > Time Spent: 0.5h > Remaining Estimate: 0h > > General updates and fixes to the new measures added as part of > [GRIFFIN-358|https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-358] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-365) Measure Enhancements and Stability fixes
[ https://issues.apache.org/jira/browse/GRIFFIN-365?focusedWorklogId=655018=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655018 ] ASF GitHub Bot logged work on GRIFFIN-365: -- Author: ASF GitHub Bot Created on: 24/Sep/21 16:26 Start Date: 24/Sep/21 16:26 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #593: URL: https://github.com/apache/griffin/pull/593#issuecomment-926762937 @wankunde @guoyuepeng Can you please review this. thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 655018) Time Spent: 20m (was: 10m) > Measure Enhancements and Stability fixes > > > Key: GRIFFIN-365 > URL: https://issues.apache.org/jira/browse/GRIFFIN-365 > Project: Griffin > Issue Type: Improvement > Components: Measure Module >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Critical > Time Spent: 20m > Remaining Estimate: 0h > > General updates and fixes to the new measures added as part of > [GRIFFIN-358|https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-358] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-365) Measure Enhancements and Stability fixes
[ https://issues.apache.org/jira/browse/GRIFFIN-365?focusedWorklogId=655015=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655015 ] ASF GitHub Bot logged work on GRIFFIN-365: -- Author: ASF GitHub Bot Created on: 24/Sep/21 16:21 Start Date: 24/Sep/21 16:21 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #593: URL: https://github.com/apache/griffin/pull/593 **What changes were proposed in this pull request?** General updates and fixes to the new measures added as part of GRIFFIN-358 Key changes: - Scapegoat code analysis and other minor changes to `pom.xml` - Handling of corner cases in measures - Better exception handling and logging for measures - Minor Updates to documentation and tests **Does this PR introduce any user-facing change?** Yes. Expression for completeness measure checks for complete data. How was this patch tested? Unit Tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 655015) Remaining Estimate: 0h Time Spent: 10m > Measure Enhancements and Stability fixes > > > Key: GRIFFIN-365 > URL: https://issues.apache.org/jira/browse/GRIFFIN-365 > Project: Griffin > Issue Type: Improvement > Components: Measure Module >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Critical > Time Spent: 10m > Remaining Estimate: 0h > > General updates and fixes to the new measures added as part of > [GRIFFIN-358|https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-358] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=637969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-637969 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 15/Aug/21 00:02 Start Date: 15/Aug/21 00:02 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-898972968 Automated Message: We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the 'no-pr-activity' tag! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 637969) Remaining Estimate: 20h 50m (was: 21h) Time Spent: 3h 10m (was: 3h) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 3h 10m > Remaining Estimate: 20h 50m > > I am sorry that my English is not good. > > This is my problem: > I am trying to use Griffin for data quality projects. > Our environment is a CDH 6.1.1 cluster with Kerberos. > I want to connect to Hive, so I set up all the Kerberos configuration following the > user guide, > but the Hive connection failed. 
> > error log is: > {quote}2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore > [518] : Waiting 1 seconds before next connection > attempt. 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore > [434] : Trying to connect to metastore with URI > thrift://..com:9083 2021-06-15 13:33:02.422 ERROR 102938 --- [ > main] o.a.t.t.TSaslTransport [315] : SASL negotiation > failure > javax.security.sasl.SaslException: GSS initiate failed at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_131] at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131] at > javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131] at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:?] 
at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:286) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:211) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) > ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_131] at >
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=637970=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-637970 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 15/Aug/21 00:02 Start Date: 15/Aug/21 00:02 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #592: URL: https://github.com/apache/griffin/pull/592 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 637970) Remaining Estimate: 20h 40m (was: 20h 50m) Time Spent: 3h 20m (was: 3h 10m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 3h 20m > Remaining Estimate: 20h 40m > > I am sorry that my English is not good. > > This is my problem: > I am trying to use Griffin for data quality projects. > Our environment is a CDH 6.1.1 cluster with Kerberos. > I want to connect to Hive, so I set up all the Kerberos configuration following the > user guide, > but the Hive connection failed. 
> > error log is: > {quote}2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore > [518] : Waiting 1 seconds before next connection > attempt. 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore > [434] : Trying to connect to metastore with URI > thrift://..com:9083 2021-06-15 13:33:02.422 ERROR 102938 --- [ > main] o.a.t.t.TSaslTransport [315] : SASL negotiation > failure > javax.security.sasl.SaslException: GSS initiate failed at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_131] at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131] at > javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131] at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:?] 
at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:286) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:211) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) > ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_131] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) > ~[?:1.8.0_131] at > org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) >
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=631916=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631916 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 31/Jul/21 00:02 Start Date: 31/Jul/21 00:02 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-890259023 Automated Message: This PR is being labelled as stale and will be closed in next 15 days due to lack of activity. To avoid this push new commits or ask the committers for a review/ resolution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 631916) Remaining Estimate: 21h (was: 21h 10m) Time Spent: 3h (was: 2h 50m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 3h > Remaining Estimate: 21h > > I am sorry that my English is not good. > > This is my problem: > I am trying to use Griffin for data quality projects. > Our environment is a CDH 6.1.1 cluster with Kerberos. > I want to connect to Hive, so I set up all the Kerberos configuration following the > user guide, > but the Hive connection failed. 
> > error log is: > {quote}2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server... 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore > [518] : Waiting 1 seconds before next connection > attempt. 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore > [434] : Trying to connect to metastore with URI > thrift://..com:9083 2021-06-15 13:33:02.422 ERROR 102938 --- [ > main] o.a.t.t.TSaslTransport [315] : SASL negotiation > failure > javax.security.sasl.SaslException: GSS initiate failed at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_131] at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131] at > javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131] at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:?] 
at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:286) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:211) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) > ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_131] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] at
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=619921=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619921 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 07/Jul/21 11:29 Start Date: 07/Jul/21 11:29 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #590: URL: https://github.com/apache/griffin/pull/590#issuecomment-875526716 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619921) Remaining Estimate: 0h (was: 10m) Time Spent: 1h (was: 50m) > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 1h > Remaining Estimate: 0h > > The merge_pr.py script can be improved with many good-to-have changes like > below, > * allow python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=619920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619920 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 07/Jul/21 11:29 Start Date: 07/Jul/21 11:29 Worklog Time Spent: 10m Work Description: guoyuepeng merged pull request #590: URL: https://github.com/apache/griffin/pull/590 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619920) Remaining Estimate: 10m (was: 20m) Time Spent: 50m (was: 40m) > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 50m > Remaining Estimate: 10m > > The merge_pr.py script can be improved with many good-to-have changes like > below, > * allow python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=619918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619918 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 07/Jul/21 11:29 Start Date: 07/Jul/21 11:29 Worklog Time Spent: 10m Work Description: guoyuepeng merged pull request #583: URL: https://github.com/apache/griffin/pull/583 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619918) Time Spent: 2h (was: 1h 50m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 2h > Remaining Estimate: 0h > > This ticket aims to set up the following two automations in GitHub. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00:00 UTC to check for any issues/ > PRs that have been inactive for over 30 days and will close them in another 15 > days. > Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. > PRs having the {{awaiting-approval}}, {{work-in-progress}} or {{wip}} label are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
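Editorial note on GRIFFIN-347: the stale-PR policy quoted in the ticket (stale after 30 days of inactivity, closed after another 15, a no-pr-activity label, three exempt labels) can be sketched as a small decision function. This is only an illustration of the rule as the ticket states it, not the workflow merged in PR #583. The PullRequest class and the stale_action function are hypothetical names, and the sketch assumes that applying the stale label does not itself count as activity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Thresholds and labels taken from the ticket text.
STALE_AFTER = timedelta(days=30)   # inactivity before an item is marked stale
CLOSE_AFTER = timedelta(days=15)   # further inactivity before it is closed
STALE_LABEL = "no-pr-activity"
EXEMPT_LABELS = {"awaiting-approval", "work-in-progress", "wip"}


@dataclass
class PullRequest:
    """Hypothetical, minimal view of a PR; not a GitHub API type."""
    number: int
    last_activity: datetime
    labels: set = field(default_factory=set)


def stale_action(pr, now):
    """Return 'skip', 'none', 'mark-stale' or 'close' for one daily run.

    Assumption: idle time is measured from the last human activity, so
    labelling a PR as stale does not reset the clock.
    """
    if pr.labels & EXEMPT_LABELS:
        return "skip"
    idle = now - pr.last_activity
    if STALE_LABEL in pr.labels:
        return "close" if idle >= STALE_AFTER + CLOSE_AFTER else "none"
    return "mark-stale" if idle > STALE_AFTER else "none"


if __name__ == "__main__":
    now = datetime(2021, 8, 15, tzinfo=timezone.utc)
    pr = PullRequest(1, now - timedelta(days=31))
    print(pr.number, stale_action(pr, now))
```

A scheduler would call stale_action once per open PR on each daily run; keeping the policy in a pure function makes the 30/15-day thresholds easy to unit test without touching GitHub.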
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=619305=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619305 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 06/Jul/21 11:46 Start Date: 06/Jul/21 11:46 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #591: URL: https://github.com/apache/griffin/pull/591#issuecomment-874684760 Thanks for the merge! :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619305) Time Spent: 1h 40m (was: 1.5h) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > Griffin DSL allows the implementation of measures (DQ types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expressions, > task hierarchies and parsing, most of which is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query, but with substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel, which > can cause inefficiencies on the SparkSession side, and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, > the flow is still branched, and crucial memory is used for holding the data set rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing has a very > limited scope currently even though hundreds of Spark SQL functions are > available for use. > * This blocks structured streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves -- This message was sent by Atlassian Jira (v8.3.4#803005)
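Editorial note on GRIFFIN-358: the "scanned multiple times" drawback can be made concrete with a small sketch that builds one SparkSQL statement computing both the total row count and the incomplete row count in a single scan. This is an illustration only and not Griffin's measure implementation. The completeness_sql helper, the table name and the column names are hypothetical, and the sketch assumes simple unqualified identifiers and a batch (non-streaming) query.

```python
def completeness_sql(table, columns):
    """Build one SparkSQL query that counts total and incomplete rows in a single scan.

    A row is treated as incomplete when any of the listed columns is NULL.
    Hypothetical helper for illustration; identifiers are assumed to be
    simple names without a database prefix.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # One boolean condition covering every checked column.
    any_null = " OR ".join(f"`{c}` IS NULL" for c in columns)
    return (
        "SELECT COUNT(*) AS total, "
        f"SUM(CASE WHEN {any_null} THEN 1 ELSE 0 END) AS incomplete "
        f"FROM `{table}`"
    )


if __name__ == "__main__":
    # Example with invented table and column names.
    print(completeness_sql("users", ["id", "email"]))
```

The number of complete rows is then total minus incomplete, so both figures come from one pass over the table instead of one query per count.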
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=619307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619307 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 06/Jul/21 11:46 Start Date: 06/Jul/21 11:46 Worklog Time Spent: 10m Work Description: chitralverma edited a comment on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-874686893 > @chitralverma > since we have [GRIFFIN-360], do we still need this PR? Yes @guoyuepeng. This MR is about automatically closing stale PRs on GitHub when they lack any activity for a long duration. I see that you recently closed a lot of PRs as they were very old. This PR does the same thing: it tags PRs as stale and closes them if there is no activity. GRIFFIN-360 is about enhancements to the PR merge script itself. So these are 2 separate issues. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619307) Time Spent: 1h 50m (was: 1h 40m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1h 50m > Remaining Estimate: 0h > > This ticket aims to set up the following two automations in GitHub. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00:00 UTC to check for any issues/ > PRs that have been inactive for over 30 days and will close them in another 15 > days. 
> Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. > PRs having the {{awaiting-approval}}, {{work-in-progress}} or {{wip}} label are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
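The stale-PR rule quoted in the ticket above can be sketched as a small decision function. This is only an illustration of the described policy, not the workflow from PR #583: the function name `stale_action`, the returned action strings, and the reading of "over 30" as 30 days are all assumptions.

```python
from datetime import datetime, timedelta, timezone

# Thresholds taken from the ticket text; "30" is assumed to mean 30 days.
STALE_AFTER = timedelta(days=30)   # inactivity before the stale label is applied
CLOSE_AFTER = timedelta(days=15)   # further inactivity before the PR is closed
STALE_LABEL = "no-pr-activity"
EXEMPT_LABELS = {"awaiting-approval", "work-in-progress", "wip"}


def stale_action(last_activity, labels, now):
    """Hypothetical helper: return 'skip', 'none', 'mark-stale' or 'close'."""
    if EXEMPT_LABELS & set(labels):
        # Exempt labels exclude the PR from the check entirely.
        return "skip"
    idle = now - last_activity
    if STALE_LABEL in labels:
        # Already marked: close once the extra grace period has also elapsed.
        return "close" if idle >= STALE_AFTER + CLOSE_AFTER else "none"
    return "mark-stale" if idle > STALE_AFTER else "none"


now = datetime(2021, 7, 6, tzinfo=timezone.utc)
print(stale_action(now - timedelta(days=31), [], now))
print(stale_action(now - timedelta(days=46), [STALE_LABEL], now))
print(stale_action(now - timedelta(days=90), ["wip"], now))
```

A real workflow would run this kind of check once per day on a schedule; the sketch covers only the per-PR decision.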
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=619302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619302 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 06/Jul/21 11:45 Start Date: 06/Jul/21 11:45 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-874686893 > @chitralverma > since we have [GRIFFIN-360], do we still need this PR? Yes @guoyuepeng. This MR is about automating closing of stale PRs on github when they lack any activity for long duration. I see that you recently closed a lot of PRs as they were very old. This PR does the same thing, it tags PRs as stale and closes them if there is no activity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619302) Time Spent: 1h 40m (was: 1.5h) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1h 40m > Remaining Estimate: 0h > > This ticket aims to set up the following 2 automation in Github. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. 
> PRs having the {{awaiting-approval}}, {{work-in-progress}} or {{wip}} label are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=619011=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619011 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 06/Jul/21 11:00 Start Date: 06/Jul/21 11:00 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-874089188 @chitralverma since we have [GRIFFIN-360], do we still need this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 619011) Time Spent: 1.5h (was: 1h 20m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1.5h > Remaining Estimate: 0h > > This ticket aims to set up the following 2 automation in Github. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. > PR s having {{awaiting-approval}}, {{work-in-progress}} or {{wip}} label are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=618998=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618998 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 06/Jul/21 10:59 Start Date: 06/Jul/21 10:59 Worklog Time Spent: 10m Work Description: guoyuepeng merged pull request #591: URL: https://github.com/apache/griffin/pull/591 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 618998) Time Spent: 1.5h (was: 1h 20m) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expression, > task hierarchies, parsing and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel which > can cause inefficiencies on the SparkSession side and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
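To illustrate the "suboptimal processing" drawback in the ticket above, here is a minimal sketch of a completeness-style measure computed by one SQL query in a single scan, rather than one scan per count. It is not Griffin's implementation: SQLite stands in for SparkSQL so the example is self-contained, and the table `source`, the column `email`, and the helper `completeness` are hypothetical names.

```python
import sqlite3

# In-memory stand-in data set; in Griffin this would be a Spark data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO source VALUES (?, ?)",
    [(1, "a@example.org"), (2, None), (3, "c@example.org"), (4, None)],
)


def completeness(conn, table, column):
    """Hypothetical helper: total and incomplete counts from one query.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    total, incomplete = conn.execute(
        f"SELECT COUNT(*), "
        f"SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) "
        f"FROM {table}"
    ).fetchone()
    incomplete = incomplete or 0  # SUM over an empty table yields NULL
    return {"total": total, "incomplete": incomplete, "complete": total - incomplete}


result = completeness(conn, "source", "email")
print(result)
```

Both aggregates come from the same pass over the data, which is the property the ticket says the generated Griffin DSL steps lack.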
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=618638=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618638 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 05/Jul/21 12:58 Start Date: 05/Jul/21 12:58 Worklog Time Spent: 10m Work Description: guoyuepeng merged pull request #591: URL: https://github.com/apache/griffin/pull/591 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 618638) Time Spent: 1h 20m (was: 1h 10m) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expression, > task hierarchies, parsing and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel which > can cause inefficiencies on the SparkSession side and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=618636=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618636 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 05/Jul/21 12:53 Start Date: 05/Jul/21 12:53 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-874089188 @chitralverma since we have [GRIFFIN-360], do we still need this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 618636) Time Spent: 1h 20m (was: 1h 10m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1h 20m > Remaining Estimate: 0h > > This ticket aims to set up the following 2 automation in Github. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. > PR s having {{awaiting-approval}}, {{work-in-progress}} or {{wip}} label are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=618508=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618508 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 05/Jul/21 04:17 Start Date: 05/Jul/21 04:17 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #591: URL: https://github.com/apache/griffin/pull/591#issuecomment-873771974 LGTM. Will merge it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 618508) Time Spent: 1h 10m (was: 1h) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expression, > task hierarchies, parsing and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel which > can cause inefficiencies on the SparkSession side and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=618497=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618497 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 05/Jul/21 03:37 Start Date: 05/Jul/21 03:37 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #591: URL: https://github.com/apache/griffin/pull/591#issuecomment-873759586 > big patch. > let me go through it today. > > Thanks. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 618497) Time Spent: 1h (was: 50m) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expression, > task hierarchies, parsing and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. 
While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel, which > can cause inefficiencies on the SparkSession side, and the internal task > scheduler was single-threaded. Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=617949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-617949 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 02/Jul/21 02:35 Start Date: 02/Jul/21 02:35 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #591: URL: https://github.com/apache/griffin/pull/591#issuecomment-872669275 big patch. let me go through it today. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 617949) Time Spent: 50m (was: 40m) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expression, > task hierarchies, parsing and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but substitution of user-defined constraints. > This approach has some drawbacks, > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel which > can cause inefficiencies on the SparkSession side and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=616938=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616938 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 30/Jun/21 06:57 Start Date: 30/Jun/21 06:57 Worklog Time Spent: 10m Work Description: chitralverma commented on a change in pull request #591: URL: https://github.com/apache/griffin/pull/591#discussion_r661182329 ## File path: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/batch/ElasticSearchGriffinDataConnector.scala ## @@ -19,13 +19,13 @@ package org.apache.griffin.measure.datasource.connector.batch import java.io.{BufferedReader, ByteArrayInputStream, InputStreamReader} import java.net.URI -import java.util import scala.collection.mutable import scala.collection.mutable.ArrayBuffer import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper} import com.fasterxml.jackson.module.scala.DefaultScalaModule +import java.util Review comment: done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 616938) Time Spent: 40m (was: 0.5h) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. 
> To enable such measures there is an extensive implementation of expression, > task hierarchies, and parsing, and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but with substitution of user-defined constraints. > This approach has some drawbacks: > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel, which > can cause inefficiencies on the SparkSession side, and the internal task > scheduler was single-threaded. Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=616883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616883 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 30/Jun/21 02:06 Start Date: 30/Jun/21 02:06 Worklog Time Spent: 10m Work Description: wankunde removed a comment on pull request #590: URL: https://github.com/apache/griffin/pull/590#issuecomment-871041023 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 616883) Remaining Estimate: 20m (was: 0.5h) Time Spent: 40m (was: 0.5h) > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 40m > Remaining Estimate: 20m > > The merge_pr.py script can be improved with many good-to-have changes like > below, > * allow python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=616881=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616881 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 30/Jun/21 02:03 Start Date: 30/Jun/21 02:03 Worklog Time Spent: 10m Work Description: wankunde commented on pull request #590: URL: https://github.com/apache/griffin/pull/590#issuecomment-871041023 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 616881) Remaining Estimate: 0.5h (was: 40m) Time Spent: 0.5h (was: 20m) > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 0.5h > Remaining Estimate: 0.5h > > The merge_pr.py script can be improved with many good-to-have changes like > below, > * allow python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=616878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616878 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 30/Jun/21 02:00 Start Date: 30/Jun/21 02:00 Worklog Time Spent: 10m Work Description: wankunde commented on a change in pull request #591: URL: https://github.com/apache/griffin/pull/591#discussion_r661074917 ## File path: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/batch/ElasticSearchGriffinDataConnector.scala ## @@ -19,13 +19,13 @@ package org.apache.griffin.measure.datasource.connector.batch import java.io.{BufferedReader, ByteArrayInputStream, InputStreamReader} import java.net.URI -import java.util import scala.collection.mutable import scala.collection.mutable.ArrayBuffer import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper} import com.fasterxml.jackson.module.scala.DefaultScalaModule +import java.util Review comment: reorder this import plz -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 616878) Time Spent: 0.5h (was: 20m) > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types, > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > GriffinDSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. 
> To enable such measures there is an extensive implementation of expression, > task hierarchies, and parsing, and most of this is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but with substitution of user-defined constraints. > This approach has some drawbacks: > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel, which > can cause inefficiencies on the SparkSession side, and the internal task > scheduler was single-threaded. Even though the data set can be cached, it is still > branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks Structured Streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this, but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. The Profiling measure > and SparkSQL are redundant functionalities. > The proposed solution involves SparkSQL DSL-based measures and some changes > to Rule Params. This will enhance the data preprocessing flows and the measures > themselves. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616872=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616872 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:52
            Start Date: 30/Jun/21 01:52
    Worklog Time Spent: 10m
      Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661073537

## File path: service/src/main/resources/application.properties ##
@@ -22,6 +22,9 @@ spring.datasource.password=123456
 spring.jpa.generate-ddl=true
 spring.datasource.driver-class-name=org.postgresql.Driver
 spring.jpa.show-sql=true
+# kerberos
+# add new configuration for kerberos file
+krb5conf.path=/path/to/krb5conf/file

Review comment:
> Should we comment out this configuration by default?

Excuse me, I am a little confused. Would you explain it for me?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616872)
    Remaining Estimate: 21h 10m  (was: 21h 20m)
            Time Spent: 2h 50m  (was: 2h 40m)

> Hive kerberos for Griffin error
> -------------------------------
>
>                 Key: GRIFFIN-363
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-363
>             Project: Griffin
>          Issue Type: Bug
>          Components: Service Module
>    Affects Versions: 0.6.0
>         Environment: CentOS 7
>            Reporter: MenghuiWan
>            Priority: Minor
>              Labels: kerberos
>   Original Estimate: 24h
>          Time Spent: 2h 50m
>  Remaining Estimate: 21h 10m
>
> I am sorry that my English is not good.
>
> This is my problem:
> I am trying to use Griffin for data quality projects.
> Our environment is a CDH 6.1.1 cluster with Kerberos.
> I want to connect to Hive, so I set up all the Kerberos configurations
> according to the user guide.
> But I found that the Hive connection failed.
>
> The error log is:
> {quote}2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore [518] : Waiting 1 seconds before next connection attempt.
> 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore [434] : Trying to connect to metastore with URI thrift://..com:9083
> 2021-06-15 13:33:02.422 ERROR 102938 --- [ main] o.a.t.t.TSaslTransport [315] : SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
>  at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_131]
>  at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) ~[libthrift-0.9.3.jar:0.9.3]
>  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[libthrift-0.9.3.jar:0.9.3]
>  at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
>  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) ~[hive-shims-common-2.2.0.jar:2.2.0]
>  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>  at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
>  at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:?]
>  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) ~[hive-metastore-2.2.0.jar:2.2.0]
>  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:286) ~[hive-metastore-2.2.0.jar:2.2.0]
>  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:211) ~[hive-metastore-2.2.0.jar:2.2.0]
>  at org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT]
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
>  at
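The review question quoted above asks whether `krb5conf.path` should be commented out by default. As a minimal sketch of what that would mean for the reading side, assuming the value is loaded through plain `java.util.Properties` (Griffin itself injects it through Spring, so this is an illustration rather than the project's code): a commented-out line produces no key at all, so the consumer has to treat the path as optional. The class `Krb5ConfSketch` and its methods are hypothetical names; `java.security.krb5.conf` is the standard JVM system property for the Kerberos configuration file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper, not part of Griffin: treats krb5conf.path as optional. */
final class Krb5ConfSketch {

    // Key name as it appears in the application.properties diff above.
    static final String KRB5_CONF_KEY = "krb5conf.path";

    /** Loads properties text; lines starting with '#' are comments and yield no key. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    /** Returns the configured path, or empty when the key is commented out or blank. */
    static Optional<String> krb5ConfPath(Properties props) {
        String value = props.getProperty(KRB5_CONF_KEY);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        Properties commentedOut = parse("# kerberos\n# krb5conf.path=/path/to/krb5conf/file\n");
        Properties enabled = parse("# kerberos\nkrb5conf.path=/path/to/krb5conf/file\n");

        // Only point the JVM at a krb5.conf when a path is actually configured.
        krb5ConfPath(enabled).ifPresent(path -> System.setProperty("java.security.krb5.conf", path));

        System.out.println("commented out -> " + krb5ConfPath(commentedOut).isPresent());
        System.out.println("enabled       -> " + krb5ConfPath(enabled).orElse("(unset)"));
    }
}
```

In a Spring bean the same idea would presumably be expressed with a placeholder default such as `${krb5conf.path:}`, so that a commented-out line does not prevent the service from starting on clusters without Kerberos.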
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616871=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616871 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:52
            Start Date: 30/Jun/21 01:52
    Worklog Time Spent: 10m
      Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661073465

## File path: service/src/main/resources/application.properties ##
@@ -22,6 +22,9 @@ spring.datasource.password=123456
 spring.jpa.generate-ddl=true
 spring.datasource.driver-class-name=org.postgresql.Driver
 spring.jpa.show-sql=true
+# kerberos
+# add new configuration for kerberos file
+krb5conf.path=/path/to/krb5conf/file

Review comment:
Excuse me, I am a little confused. Would you explain it for me?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616871)
    Remaining Estimate: 21h 20m  (was: 21.5h)
            Time Spent: 2h 40m  (was: 2.5h)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616864=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616864 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:44
            Start Date: 30/Jun/21 01:44
    Worklog Time Spent: 10m
      Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661071240

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one

     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")
+    private String keytabPath;
+
+    @Value("${hive.keytab.user}")
+    private String keytabUser;
+
+    @Value("${hive.need.kerberos}")
+    private String needKerberos;

Review comment:
@wankunde
> Agree with @chitralverma
>
> @lovelyqincai Could we change `needKerberos` variable in `org.apache.griffin.core.metastore.hive.HiveMetaStoreServiceJdbcImpl` to boolean type in another PR.

OK, I also agree.

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616864)
    Remaining Estimate: 21.5h  (was: 21h 40m)
            Time Spent: 2.5h  (was: 2h 20m)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616863 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:44
            Start Date: 30/Jun/21 01:44
    Worklog Time Spent: 10m
      Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661071078

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one

     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")
+    private String keytabPath;
+
+    @Value("${hive.keytab.user}")
+    private String keytabUser;
+
+    @Value("${hive.need.kerberos}")
+    private String needKerberos;

Review comment:
OK, I also agree.

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616863)
    Remaining Estimate: 21h 40m  (was: 21h 50m)
            Time Spent: 2h 20m  (was: 2h 10m)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616861 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:41
            Start Date: 30/Jun/21 01:41
    Worklog Time Spent: 10m
      Work Description: wankunde commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661070386

## File path: service/src/main/resources/application.properties ##
@@ -22,6 +22,9 @@ spring.datasource.password=123456
 spring.jpa.generate-ddl=true
 spring.datasource.driver-class-name=org.postgresql.Driver
 spring.jpa.show-sql=true
+# kerberos
+# add new configuration for kerberos file
+krb5conf.path=/path/to/krb5conf/file

Review comment:
Should we comment out this configuration by default?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616861)
    Remaining Estimate: 21h 50m  (was: 22h)
            Time Spent: 2h 10m  (was: 2h)
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=616860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-616860 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jun/21 01:39
            Start Date: 30/Jun/21 01:39
    Worklog Time Spent: 10m
      Work Description: wankunde commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r661069716

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one

     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")
+    private String keytabPath;
+
+    @Value("${hive.keytab.user}")
+    private String keytabUser;
+
+    @Value("${hive.need.kerberos}")
+    private String needKerberos;

Review comment:
Agree with @chitralverma.

@lovelyqincai, could we change the `needKerberos` variable in `org.apache.griffin.core.metastore.hive.HiveMetaStoreServiceJdbcImpl` to boolean type in another PR?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 616860)
    Remaining Estimate: 22h  (was: 22h 10m)
            Time Spent: 2h  (was: 1h 50m)
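The suggestion above is to hold `needKerberos` as a boolean instead of a `String`. A minimal sketch of that idea, assuming the flag is read from plain `java.util.Properties` (the reviewed code injects it through Spring's `@Value`, so the class `KerberosFlagSketch` and its method are hypothetical and only illustrate the type change):

```java
import java.util.Properties;

/** Hypothetical helper, not part of Griffin: reads the Kerberos switch as a boolean. */
final class KerberosFlagSketch {

    // Key name as it appears in the @Value annotation in the diff above.
    static final String NEED_KERBEROS_KEY = "hive.need.kerberos";

    /** True only when the property is present and equals "true", ignoring case and padding. */
    static boolean needKerberos(Properties props) {
        String raw = props.getProperty(NEED_KERBEROS_KEY);
        return raw != null && Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(NEED_KERBEROS_KEY, "true");

        // With a boolean, call sites no longer need string comparisons against "true".
        if (needKerberos(props)) {
            System.out.println("Kerberos login would happen here");
        } else {
            System.out.println("Kerberos disabled");
        }
    }
}
```

With Spring, declaring the injected field as `boolean` should have a similar effect, since the container converts the placeholder value to the field type; whether that fits `HiveMetaStoreServiceJdbcImpl` is left to the follow-up PR mentioned in the comment.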
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615383=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615383 ]

ASF GitHub Bot logged work on GRIFFIN-363:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Jun/21 11:09
            Start Date: 27/Jun/21 11:09
    Worklog Time Spent: 10m
      Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r659304138

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one

     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")

Review comment:
> Since kerberos doesn't apply to just hive but hadoop services in general, the "hive." prefix can be removed from all these configs.

Hmm, Kerberos has many keytabs for different users, like hive, hdfs, livy and so on. I can see many keytab files with names like hive.keytab, hdfs.keytab... How about keeping the prefix?

Issue Time Tracking
-------------------

            Worklog Id:     (was: 615383)
    Remaining Estimate: 22h 20m  (was: 22.5h)
            Time Spent: 1h 40m  (was: 1.5h)
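The exchange above weighs a shared, unprefixed Kerberos configuration against per-service keys such as `hive.keytab.path`. One way to reconcile the two positions is sketched below, purely as an assumption and not as anything proposed in the PR: look up the service-specific key first and fall back to a shared one. Only `hive.keytab.path` comes from the diff; the unprefixed `keytab.path` key, the class `KeytabLookupSketch`, and the file paths are hypothetical.

```java
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper, not part of Griffin: resolves a keytab path per service. */
final class KeytabLookupSketch {

    /** Looks up "<service>.keytab.path" first, then falls back to a shared "keytab.path". */
    static Optional<String> keytabPath(Properties props, String service) {
        String specific = props.getProperty(service + ".keytab.path");
        if (specific != null) {
            return Optional.of(specific);
        }
        return Optional.ofNullable(props.getProperty("keytab.path"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("hive.keytab.path", "/path/to/hive.keytab"); // key from the diff above
        props.setProperty("keytab.path", "/path/to/shared.keytab");    // hypothetical shared key

        // hive has its own entry; hdfs has none and uses the shared fallback.
        System.out.println(keytabPath(props, "hive").orElse("(none)"));
        System.out.println(keytabPath(props, "hdfs").orElse("(none)"));
    }
}
```

Keeping the prefix, as the commenter prefers, corresponds to using only the first lookup; dropping it corresponds to using only the fallback.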
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615384=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615384 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 27/Jun/21 11:09 Start Date: 27/Jun/21 11:09 Worklog Time Spent: 10m Work Description: lovelyqincai commented on a change in pull request #592: URL: https://github.com/apache/griffin/pull/592#discussion_r659304138 ## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ## @@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one private IMetaStoreClient client = null; +@Value("${hive.krb5conf.path}") +private String hiveKrb5confPath; + +@Value("${hive.keytab.path}") Review comment: > Since kerberos doesnt apply to just hive but hadoop services in general, the "hive." prefix can be removed from all these configs. emmm, kerberos have many keytabs for different user,like hive,hdfs,livy and so on. i can see many keytab files name like hive.keytab,hdfs.keytab... how about keep the prefix? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 615384) Remaining Estimate: 22h 10m (was: 22h 20m) Time Spent: 1h 50m (was: 1h 40m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 1h 50m > Remaining Estimate: 22h 10m > > i am sorry for my english is not well. 
>
> This is my problem:
> i try to use griffin for data quality projects.
> our enviroment is CDH 6.1.1 cluster with kerberos.
> i want to connect hive then i set up all configurations about kerberos by user guide.
> but i found hive connection had failed.
>
> error log is:
> {quote}
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 WARN 102938 --- [ main] h.metastore [487] : Failed to connect to the MetaStore Server...
> 2021-06-15 13:33:01.418 INFO 102938 --- [ main] h.metastore [518] : Waiting 1 seconds before next connection attempt.
> 2021-06-15 13:33:02.419 INFO 102938 --- [ main] h.metastore [434] : Trying to connect to metastore with URI thrift://..com:9083
> 2021-06-15 13:33:02.422 ERROR 102938 --- [ main] o.a.t.t.TSaslTransport [315] : SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[?:1.8.0_131]
>     at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131]
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:?]
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) ~[hive-shims-common-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) ~[hive-metastore-2.2.0.jar:2.2.0]
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:286) ~[hive-metastore-2.2.0.jar:2.2.0]
>     at
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615382 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 27/Jun/21 10:47 Start Date: 27/Jun/21 10:47 Worklog Time Spent: 10m Work Description: lovelyqincai commented on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-869141142 > @amazingSaltFish can you add some test cases and add more details to this JIRA and this PR. I lost this scene, but I solved the problem this way... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 615382) Remaining Estimate: 22.5h (was: 22h 40m) Time Spent: 1.5h (was: 1h 20m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 1.5h > Remaining Estimate: 22.5h > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed. 
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615381 ]

ASF GitHub Bot logged work on GRIFFIN-363:
--
Author: ASF GitHub Bot
Created on: 27/Jun/21 10:45
Start Date: 27/Jun/21 10:45
Worklog Time Spent: 10m
Work Description: lovelyqincai commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r659301139

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one
     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")

Review comment:
ok, I accept your suggestion

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 615381)
Remaining Estimate: 22h 40m (was: 22h 50m)
Time Spent: 1h 20m (was: 1h 10m)

> Hive kerberos for Griffin error
> ---
>
> Key: GRIFFIN-363
> URL: https://issues.apache.org/jira/browse/GRIFFIN-363
> Project: Griffin
> Issue Type: Bug
> Components: Service Module
> Affects Versions: 0.6.0
> Environment: CentOS 7
> Reporter: MenghuiWan
> Priority: Minor
> Labels: kerberos
> Original Estimate: 24h
> Time Spent: 1h 20m
> Remaining Estimate: 22h 40m
>
> i am sorry for my english is not well.
>
> This is my problem:
> i try to use griffin for data quality projects.
> our enviroment is CDH 6.1.1 cluster with kerberos.
> i want to connect hive then i set up all configurations about kerberos by user guide.
> but i found hive connection had failed.
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615379 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 27/Jun/21 10:28 Start Date: 27/Jun/21 10:28 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-869138900 @amazingSaltFish can you add some test cases and add more details to this JIRA and this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 615379) Remaining Estimate: 23h (was: 23h 10m) Time Spent: 1h (was: 50m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 1h > Remaining Estimate: 23h > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed. 
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615378=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615378 ]

ASF GitHub Bot logged work on GRIFFIN-363:
--
Author: ASF GitHub Bot
Created on: 27/Jun/21 10:27
Start Date: 27/Jun/21 10:27
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r659298650

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one
     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")
+    private String keytabPath;
+
+    @Value("${hive.keytab.user}")
+    private String keytabUser;
+
+    @Value("${hive.need.kerberos}")
+    private String needKerberos;
+
+    @PostConstruct
+    public void init() throws IOException {
+        if ( needKerberos != null && "true".equalsIgnoreCase(needKerberos) && hiveKrb5confPath != null) {
+            System.setProperty("java.security.krb5.conf", hiveKrb5confPath);

Review comment:
This can be a constant.

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one
     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")

Review comment:
Since kerberos doesn't apply to just hive but hadoop services in general, the "hive." prefix can be removed from all these configs.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 615378) Remaining Estimate: 23h 10m (was: 23h 20m) Time Spent: 50m (was: 40m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 50m > Remaining Estimate: 23h 10m > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed.
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=615364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615364 ]

ASF GitHub Bot logged work on GRIFFIN-363:
--
Author: ASF GitHub Bot
Created on: 27/Jun/21 08:35
Start Date: 27/Jun/21 08:35
Worklog Time Spent: 10m
Work Description: chitralverma commented on a change in pull request #592:
URL: https://github.com/apache/griffin/pull/592#discussion_r659285098

## File path: service/src/main/java/org/apache/griffin/core/metastore/hive/HiveMetaStoreProxy.java ##
@@ -55,6 +59,28 @@ Licensed to the Apache Software Foundation (ASF) under one
     private IMetaStoreClient client = null;

+    @Value("${hive.krb5conf.path}")
+    private String hiveKrb5confPath;
+
+    @Value("${hive.keytab.path}")
+    private String keytabPath;
+
+    @Value("${hive.keytab.user}")
+    private String keytabUser;
+
+    @Value("${hive.need.kerberos}")
+    private String needKerberos;

Review comment:
Can be a boolean.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@griffin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 615364)
Remaining Estimate: 23h 20m (was: 23.5h)
Time Spent: 40m (was: 0.5h)

> Hive kerberos for Griffin error
> ---
>
> Key: GRIFFIN-363
> URL: https://issues.apache.org/jira/browse/GRIFFIN-363
> Project: Griffin
> Issue Type: Bug
> Components: Service Module
> Affects Versions: 0.6.0
> Environment: CentOS 7
> Reporter: MenghuiWan
> Priority: Minor
> Labels: kerberos
> Original Estimate: 24h
> Time Spent: 40m
> Remaining Estimate: 23h 20m
>
> i am sorry for my english is not well.
>
> This is my problem:
> i try to use griffin for data quality projects.
> our enviroment is CDH 6.1.1 cluster with kerberos.
> i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed.
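The two review suggestions on HiveMetaStoreProxy in pull request #592 (treat the Kerberos switch as a boolean, and name the "java.security.krb5.conf" key as a constant) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code from the pull request: the class KerberosConfigSketch, the method applyKrb5Conf, and the use of java.util.Properties in place of Spring's @Value injection are all hypothetical, chosen so the example runs on its own. Only the property names hive.need.kerberos and hive.krb5conf.path and the JVM system property java.security.krb5.conf are taken from the diff under review.

```java
import java.util.Properties;

// Hypothetical stand-in for the Kerberos part of HiveMetaStoreProxy.init().
final class KerberosConfigSketch {

    // JVM system property naming the krb5.conf file; a constant, as the review suggests.
    static final String KRB5_CONF_PROPERTY = "java.security.krb5.conf";

    // Property names as they appear in the diff under review.
    static final String NEED_KERBEROS_KEY = "hive.need.kerberos";
    static final String KRB5_CONF_PATH_KEY = "hive.krb5conf.path";

    // Sets the krb5.conf system property when Kerberos is enabled and a path is given.
    // Returns true if the system property was set, false otherwise.
    static boolean applyKrb5Conf(Properties config) {
        // Boolean.parseBoolean is case-insensitive and yields false for null,
        // which replaces the null check plus "true".equalsIgnoreCase(...) in the diff.
        boolean needKerberos = Boolean.parseBoolean(config.getProperty(NEED_KERBEROS_KEY));
        String krb5ConfPath = config.getProperty(KRB5_CONF_PATH_KEY);
        if (needKerberos && krb5ConfPath != null && !krb5ConfPath.isEmpty()) {
            System.setProperty(KRB5_CONF_PROPERTY, krb5ConfPath);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(NEED_KERBEROS_KEY, "true");
        config.setProperty(KRB5_CONF_PATH_KEY, "/etc/krb5.conf"); // hypothetical path
        System.out.println("krb5.conf applied: " + applyKrb5Conf(config));
    }
}
```

In the Spring class itself, the same effect would presumably come from declaring the injected field as a boolean rather than a String; the sketch only shows the resulting control flow.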
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=614639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614639 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 24/Jun/21 17:45 Start Date: 24/Jun/21 17:45 Worklog Time Spent: 10m Work Description: lovelyqincai removed a comment on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-864589726 why my pr has fail,who can tell me what should i do? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 614639) Remaining Estimate: 23.5h (was: 23h 40m) Time Spent: 0.5h (was: 20m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 0.5h > Remaining Estimate: 23.5h > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed. 
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=612388=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612388 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 20/Jun/21 17:54 Start Date: 20/Jun/21 17:54 Worklog Time Spent: 10m Work Description: lovelyqincai commented on pull request #592: URL: https://github.com/apache/griffin/pull/592#issuecomment-864589726 why my pr has fail,who can tell me what should i do? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 612388) Remaining Estimate: 23h 40m (was: 23h 50m) Time Spent: 20m (was: 10m) > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 20m > Remaining Estimate: 23h 40m > > i am sorry for my english is not well. > > This is my problem: > i try to use griffin for data quality projects. > our enviroment is CDH 6.1.1 cluster with kerberos. > i want to connect hive then i set up all configurations about kerberos by > user guide. > but i found hive connection had failed. 
> > error log is: > {quote}2021-06-15 13:33:01.418 WARN 102938 — [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server...2021-06-15 13:33:01.418 WARN 102938 — [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server...2021-06-15 13:33:01.418 INFO 102938 — [ main] h.metastore > [518] : Waiting 1 seconds before next connection > attempt.2021-06-15 13:33:02.419 INFO 102938 — [ main] h.metastore > [434] : Trying to connect to metastore with URI > thrift://..com:90832021-06-15 13:33:02.422 ERROR 102938 — [ > main] o.a.t.t.TSaslTransport [315] : SASL negotiation > failure > javax.security.sasl.SaslException: GSS initiate failed at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_131] at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131] at > javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131] at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:?] 
at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:286) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:211) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) > ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_131] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) > ~[?:1.8.0_131] at > org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) >
[jira] [Work logged] (GRIFFIN-363) Hive kerberos for Griffin error
[ https://issues.apache.org/jira/browse/GRIFFIN-363?focusedWorklogId=611460=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611460 ] ASF GitHub Bot logged work on GRIFFIN-363: -- Author: ASF GitHub Bot Created on: 15/Jun/21 17:00 Start Date: 15/Jun/21 17:00 Worklog Time Spent: 10m Work Description: amazingSaltFish opened a new pull request #592: URL: https://github.com/apache/griffin/pull/592 url: https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-363?filter=allopenissues 1. Add a new property, hive.krb5conf.path. 2. Fix the Hive metastore Kerberos error. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 611460) Remaining Estimate: 23h 50m (was: 24h) Time Spent: 10m > Hive kerberos for Griffin error > --- > > Key: GRIFFIN-363 > URL: https://issues.apache.org/jira/browse/GRIFFIN-363 > Project: Griffin > Issue Type: Bug > Components: Service Module >Affects Versions: 0.6.0 > Environment: CentOS 7 >Reporter: MenghuiWan >Priority: Minor > Labels: kerberos > Original Estimate: 24h > Time Spent: 10m > Remaining Estimate: 23h 50m > > I am sorry that my English is not good. > > This is my problem: > I am trying to use Griffin for data quality projects. > Our environment is a CDH 6.1.1 cluster with Kerberos. > I want to connect to Hive, so I set up all the Kerberos configuration following the > user guide, > but I found that the Hive connection failed. 
> > error log is: > {quote}2021-06-15 13:33:01.418 WARN 102938 — [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server...2021-06-15 13:33:01.418 WARN 102938 — [ main] h.metastore > [487] : Failed to connect to the MetaStore > Server...2021-06-15 13:33:01.418 INFO 102938 — [ main] h.metastore > [518] : Waiting 1 seconds before next connection > attempt.2021-06-15 13:33:02.419 INFO 102938 — [ main] h.metastore > [434] : Trying to connect to metastore with URI > thrift://..com:90832021-06-15 13:33:02.422 ERROR 102938 — [ > main] o.a.t.t.TSaslTransport [315] : SASL negotiation > failure > javax.security.sasl.SaslException: GSS initiate failed at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_131] at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[libthrift-0.9.3.jar:0.9.3] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_131] at > javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_131] at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:?] 
at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > ~[hive-shims-common-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:478) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:286) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:211) > ~[hive-metastore-2.2.0.jar:2.2.0] at > org.apache.griffin.core.metastore.hive.HiveMetaStoreProxy.initHiveMetastoreClient(HiveMetaStoreProxy.java:68) > ~[service-0.7.0-SNAPSHOT.jar:0.7.0-SNAPSHOT] at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131] at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_131] at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_131] at java.lang.reflect.Method.invoke(Method.java:498) > ~[?:1.8.0_131] at >
[jira] [Work logged] (GRIFFIN-358) Rewrite the Rule/Measure implementations
[ https://issues.apache.org/jira/browse/GRIFFIN-358?focusedWorklogId=603701=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603701 ] ASF GitHub Bot logged work on GRIFFIN-358: -- Author: ASF GitHub Bot Created on: 28/May/21 20:51 Start Date: 28/May/21 20:51 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #591: URL: https://github.com/apache/griffin/pull/591 **What changes were proposed in this pull request?** Current `RuleParams` can be of the following 3 DSL types: - Data Ops (for source preprocessing) - Griffin DSL - SparkSQL Griffin DSL allows the implementation of measures (DQ Types) like Completeness, Accuracy, etc. To enable such measures there is an extensive implementation of expressions, task hierarchies and parsing, most of which is heavily dependent on scala-parser-combinators. At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like query but with substitution of user-defined constraints. This approach has some drawbacks: - Suboptimal processing. While the transformation steps execute in parallel on the driver, the data set is still scanned multiple times in parallel, which can cause inefficiencies on the SparkSession side, and the internal task scheduler was single-threaded. Even though the data set can be cached, it is still branched, and crucial memory is spent holding the dataset rather than processing it. - Internal functions of Spark are not used. Data preprocessing currently has a very limited scope even though hundreds of Spark SQL functions are available for use. - This blocks structured streaming. The manually constructed SQL queries cause multiple aggregations in the same query on a streaming data set, which is not supported by Spark's Structured Streaming. There are workarounds for this but they all require rewriting the *Expr2DQSteps classes. - Griffin DSL is SparkSQL-like but not 100% compatible. 
Profiling measure and SparkSQL are redundant functionalities. The proposed solution involves SparkSQL DSL based measures and some changes to Rule Params. This will enhance the data preprocessing flows and the measures themselves. **Does this PR introduce any user-facing change?** Yes. Users can use the new measures as a separate configuration and there is scope for more data pre-processing. **How was this patch tested?** Unit Tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 603701) Remaining Estimate: 0h Time Spent: 10m > Rewrite the Rule/Measure implementations > > > Key: GRIFFIN-358 > URL: https://issues.apache.org/jira/browse/GRIFFIN-358 > Project: Griffin > Issue Type: New Feature >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Current `RuleParams` can be of the following 3 DSL types: > * Data Ops (for source preprocessing) > * Griffin DSL > * SparkSQL > Griffin DSL allows the implementation of measures (DQ Types) like > Completeness, Accuracy, etc. > To enable such measures there is an extensive implementation of expressions, > task hierarchies and parsing, most of which is heavily dependent on > scala-parser-combinators. > At the end of the implementation, Griffin DSL tries to mimic a SparkSQL-like > query but with substitution of user-defined constraints. > This approach has some drawbacks: > * Suboptimal processing. While the transformation steps execute in parallel > on the driver, the data set is still scanned multiple times in parallel, which > can cause inefficiencies on the SparkSession side, and the internal task > scheduler was single-threaded. 
Even though the data set can be cached, it is > still branched, and crucial memory is spent holding the dataset rather > than processing it. > * Internal functions of Spark are not used. Data preprocessing currently has a very > limited scope even though hundreds of Spark SQL functions are > available for use. > * This blocks structured streaming. The manually constructed SQL queries > cause multiple aggregations in the same query on a streaming data set, which > is not supported by Spark's Structured Streaming. There are workarounds for > this but they all require rewriting the *Expr2DQSteps classes. > * Griffin DSL is SparkSQL-like but not 100% compatible. Profiling measure > and SparkSQL are redundant functionalities. > The proposed
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=586754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-586754 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 21/Apr/21 17:16 Start Date: 21/Apr/21 17:16 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-824225116 @guoyuepeng @wankunde can you review this? I have updated the scope of this ticket. The automation now focuses only on PRs, since Jira handles all the issues for Apache projects. Ideally we could close the Jira tickets from this automation as well, but that would involve creating tokens and setting them as secrets in the repo settings. Also, that automation should be set up directly in Jira, as GitHub won't be able to track the activity on a Jira ticket from here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 586754) Time Spent: 1h 10m (was: 1h) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1h 10m > Remaining Estimate: 0h > > This ticket aims to set up the following 2 automations in GitHub. > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and tags/ > closes them when inactive for a long duration. > {quote}This workflow will run every day at 00:00 UTC to check for any issues/ > PRs that have been inactive for over 30 days and will close them in another 15 > days. 
> Additionally, the stale PRs will be marked with a {{no-pr-activity}} label. > PRs having the {{awaiting-approval}}, {{work-in-progress}} or {{wip}} labels are > excluded from this check. > {quote} > > *Greet new users* > {quote}Add a GitHub Workflow that automatically greets new users on their > first PR. > {quote} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
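The stale-PR rule quoted in the ticket can be sketched roughly as follows. This is only an illustration of the decision logic, not the workflow that was merged: the helper name `classify_pr` is hypothetical, and it assumes the 15-day closing window simply starts once the 30-day threshold has passed, which may differ from how the actual GitHub Action counts activity. The thresholds and label names are taken from the ticket text.

```python
from datetime import datetime, timedelta, timezone

# Thresholds and label names as stated in the ticket description.
STALE_AFTER = timedelta(days=30)   # inactive for over 30 days -> stale
CLOSE_AFTER = timedelta(days=15)   # closed after another 15 days
EXEMPT_LABELS = {"awaiting-approval", "work-in-progress", "wip"}
STALE_LABEL = "no-pr-activity"


def classify_pr(last_activity, labels, now):
    """Hypothetical helper: return 'exempt', 'active', 'stale' or 'close'.

    Assumption: the closing window is measured from the moment the PR
    crosses the stale threshold, with no activity in between.
    """
    if EXEMPT_LABELS & set(labels):
        return "exempt"
    idle = now - last_activity
    if idle > STALE_AFTER + CLOSE_AFTER:
        return "close"
    if idle > STALE_AFTER:
        return "stale"  # this is where STALE_LABEL would be applied
    return "active"


if __name__ == "__main__":
    now = datetime(2021, 4, 21, tzinfo=timezone.utc)
    for days, labels in [(10, []), (31, []), (46, []), (60, ["wip"])]:
        print(days, labels, classify_pr(now - timedelta(days=days), labels, now))
```

Keeping the rule as a pure function of timestamps and labels makes the thresholds easy to check without touching the GitHub API.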
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=586629=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-586629 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 21/Apr/21 14:46 Start Date: 21/Apr/21 14:46 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #590: URL: https://github.com/apache/griffin/pull/590#issuecomment-824120086 @guoyuepeng @wankunde can you review this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 586629) Remaining Estimate: 40m (was: 50m) Time Spent: 20m (was: 10m) > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 20m > Remaining Estimate: 40m > > The merge_pr.py script can be improved with many good-to-have changes like > below, > * allow python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-360) Improve merge_pr.py script
[ https://issues.apache.org/jira/browse/GRIFFIN-360?focusedWorklogId=586627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-586627 ] ASF GitHub Bot logged work on GRIFFIN-360: -- Author: ASF GitHub Bot Created on: 21/Apr/21 14:45 Start Date: 21/Apr/21 14:45 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #590: URL: https://github.com/apache/griffin/pull/590 **What changes were proposed in this pull request?** The merge_pr.py script can be improved with many good-to-have changes like the ones below, - allow Python 3 compatibility - better check for Jira dependency - Updating Jira with more details like assignee and contributor details - upgrading dependencies Also added a requirements.txt file for installation of script dependencies. **Does this PR introduce any user-facing change?** No. Committers will use this script to merge changes. **How was this patch tested?** In sync with the Spark merge script. Used this to merge previous PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 586627) Remaining Estimate: 50m (was: 1h) Time Spent: 10m > Improve merge_pr.py script > -- > > Key: GRIFFIN-360 > URL: https://issues.apache.org/jira/browse/GRIFFIN-360 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Priority: Major > Original Estimate: 1h > Time Spent: 10m > Remaining Estimate: 50m > > The merge_pr.py script can be improved with many good-to-have changes like the ones > below, > * allow Python 3 compatibility > * better check for Jira dependency > * Updating Jira with more details like assignee and contributor details > * upgrading dependencies > Also added a requirements.txt file for installation of script dependencies. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
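One way a "better check for Jira dependency" could look in a merge script is an optional import that degrades gracefully when the package is missing. The sketch below is an assumption about the general pattern, not the code from PR #590: `optional_import` is a hypothetical helper, and the package name `jira` is assumed to be the dependency listed in requirements.txt.

```python
import importlib


def optional_import(module_name):
    """Hypothetical helper: return the imported module, or None if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Assumption: the Jira client is published as the "jira" package.
jira_module = optional_import("jira")
JIRA_IMPORTED = jira_module is not None

if __name__ == "__main__":
    if JIRA_IMPORTED:
        print("Jira client found; the ticket can be updated after merging.")
    else:
        print("Jira client not installed; skipping the Jira update step.")
```

With a flag like `JIRA_IMPORTED`, the script can still merge a PR and simply skip the ticket update when the client library is not available. The same code runs unchanged on Python 3.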
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=562925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-562925 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 09/Mar/21 09:03 Start Date: 09/Mar/21 09:03 Worklog Time Spent: 10m Work Description: asfgit closed pull request #589: URL: https://github.com/apache/griffin/pull/589 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 562925) Time Spent: 1.5h (was: 1h 20m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=560537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560537 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 03/Mar/21 16:03 Start Date: 03/Mar/21 16:03 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-789824308 Great, I'll merge this then when I'm back working next. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 560537) Time Spent: 1h 20m (was: 1h 10m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=560347=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560347 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 03/Mar/21 08:31 Start Date: 03/Mar/21 08:31 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-789536973 LGTM! Thanks guys. @chitralverma @wankunde This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 560347) Time Spent: 1h 10m (was: 1h) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=560303=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560303 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 03/Mar/21 06:33 Start Date: 03/Mar/21 06:33 Worklog Time Spent: 10m Work Description: wankunde commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-789474905 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 560303) Time Spent: 1h (was: 50m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=560070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560070 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 02/Mar/21 17:52 Start Date: 02/Mar/21 17:52 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-789093549 @wankunde @guoyuepeng Just added support for cross version build against Spark 3.0.2 as well ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 560070) Time Spent: 50m (was: 40m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=560010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-560010 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 02/Mar/21 15:39 Start Date: 02/Mar/21 15:39 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-788999767 @wankunde spark 3 has some interface changes but let me check This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 560010) Time Spent: 40m (was: 0.5h) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=559984=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-559984 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 02/Mar/21 14:37 Start Date: 02/Mar/21 14:37 Worklog Time Spent: 10m Work Description: wankunde commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-788953867 @chitralverma Hi, chitralverma , I think it's very useful to support cross-version compilation for Scala and Spark dependencies. Since spark 3 has been released for some time, can we support it at the same time? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 559984) Time Spent: 0.5h (was: 20m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=558549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-558549 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 26/Feb/21 12:33 Start Date: 26/Feb/21 12:33 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #589: URL: https://github.com/apache/griffin/pull/589#issuecomment-786620924 @wankunde @guoyuepeng please review this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 558549) Time Spent: 20m (was: 10m) > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-345) Support cross-version compilation for Scala and other dependencies
[ https://issues.apache.org/jira/browse/GRIFFIN-345?focusedWorklogId=557740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557740 ] ASF GitHub Bot logged work on GRIFFIN-345: -- Author: ASF GitHub Bot Created on: 25/Feb/21 06:11 Start Date: 25/Feb/21 06:11 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #589: URL: https://github.com/apache/griffin/pull/589 **What changes were proposed in this pull request?** _This PR affects only the measure module._ In newer environments, especially cloud platforms, the Griffin measure module may face compatibility issues due to the old Scala and Spark versions. To remedy this, the following topics are covered in this ticket, - Cross-compilation across Scala major versions (2.11, 2.12) - Update Spark Version (2.4+) - Create Maven profiles to build different Scala and Spark versions - Changes to build strategy The same approach is used in Apache Spark to build for different versions of Scala and Hadoop. **Does this PR introduce any user-facing change?** No **How was this patch tested?** Via the Maven build process. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 557740) Remaining Estimate: 0h Time Spent: 10m > Support cross-version compilation for Scala and other dependencies > -- > > Key: GRIFFIN-345 > URL: https://issues.apache.org/jira/browse/GRIFFIN-345 > Project: Griffin > Issue Type: Improvement >Reporter: Tushar >Assignee: Chitral Verma >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Following topics are covered in this ticket, > * Cross-compilation across scala major versions (2.11, 2.12 and 2.13) > * Update Spark Version (2.4+) > * Explore maven profiles to execute on both Vanilla HDFS and AWS EMR (these > profiles can be extended to support GCP etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=491777=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491777 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 28/Sep/20 00:55 Start Date: 28/Sep/20 00:55 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-699715572 > > @chitralverma > > Good to know this feature, have you tested it on our jira? > > Thanks, > > William > > Hi William, the bot only closes the issues and PRs on Github. Do you want me to update this to close the tickets automatically as well? Cool, please close the ticket at the same time. BTW, when you say close the issue, do you mean close JIRA tickets? Where are those issues documented? Thanks, William This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 491777) Time Spent: 1h (was: 50m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 1h > Remaining Estimate: 0h > > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and Issues and > tags/ closes them when inactive for a long duration. > {quote}This workflow will run every day at 00:00 UTC to check for any issues/ > PRs that have been inactive for over 30 days and will close them in another 15 > days. > Additionally, the stale issues will be marked with `stale-issue` label and > the stale PRs will be marked with `stale-pr` label. 
Issues/ PR having > `awaiting-approval` or `work-in-progress` labels are excluded from this check. > {quote} > *Greet new users* > Add a GitHub Workflow that automatically greets new users on their first PR/ > Issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=489536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489536 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 23/Sep/20 11:38 Start Date: 23/Sep/20 11:38 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-697307471 > @chitralverma > > Good to know this feature, have you tested it on our jira? > > Thanks, > William Hi William, the bot only closes the issues and PRs on Github. Do you want me to update this to close the tickets automatically as well? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 489536) Time Spent: 50m (was: 40m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 50m > Remaining Estimate: 0h > > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and Issues and > tags/ closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale issues will be marked with `stale-issue` label and > the stale PRs will be marked with `stale-pr` label. Issues/ PR having > `awaiting-approval` or `work-in-progress` labels are excluded from this check. > {quote} > *Greet new users* > Add a GitHub Workflow that automatically greets new users on their first PR/ > Issue. 
> > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=488805=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488805 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 23/Sep/20 04:11 Start Date: 23/Sep/20 04:11 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #583: URL: https://github.com/apache/griffin/pull/583#issuecomment-697044695 @chitralverma Good to know this feature, have you tested it on our jira? Thanks, William This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 488805) Time Spent: 40m (was: 0.5h) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 40m > Remaining Estimate: 0h > > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and Issues and > tags/ closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale issues will be marked with `stale-issue` label and > the stale PRs will be marked with `stale-pr` label. Issues/ PR having > `awaiting-approval` or `work-in-progress` labels are excluded from this check. > {quote} > *Greet new users* > Add a GitHub Workflow that automatically greets new users on their first PR/ > Issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=487429=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487429 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 22/Sep/20 03:00 Start Date: 22/Sep/20 03:00 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #583: URL: https://github.com/apache/griffin/pull/583 **What changes were proposed in this pull request?** Add a GitHub Workflow that automatically checks for stale PRs and Issues and tags/ closes them when inactive for a long duration. > This workflow will run every day at 00.00 UTC to check for any issues/ PRs that have been inactive for over 30 and will close them in another 15 days. > Additionally, the stale issues will be marked with `stale-issue` label and the stale PRs will be marked with `stale-pr` label. Issues/ PR having `awaiting-approval` or `work-in-progress` labels are excluded from this check. Greet new users Add a GitHub Workflow that automatically greets new users on their first PR/ Issue. **Does this PR introduce any user-facing change?** Yes. Users will see comments on PRs and Issues **How was this patch tested?** No test required This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487429) Time Spent: 20m (was: 10m) > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 20m > Remaining Estimate: 0h > > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and Issues and > tags/ closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale issues will be marked with `stale-issue` label and > the stale PRs will be marked with `stale-pr` label. Issues/ PR having > `awaiting-approval` or `work-in-progress` labels are excluded from this check. > {quote} > *Greet new users* > Add a GitHub Workflow that automatically greets new users on their first PR/ > Issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-347) Setup automated workflows
[ https://issues.apache.org/jira/browse/GRIFFIN-347?focusedWorklogId=486764=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486764 ] ASF GitHub Bot logged work on GRIFFIN-347: -- Author: ASF GitHub Bot Created on: 21/Sep/20 05:11 Start Date: 21/Sep/20 05:11 Worklog Time Spent: 10m Work Description: chitralverma opened a new pull request #583: URL: https://github.com/apache/griffin/pull/583 **What changes were proposed in this pull request?** Add a GitHub Workflow that automatically checks for stale PRs and Issues and tags/ closes them when inactive for a long duration. > This workflow will run every day at 00.00 UTC to check for any issues/ PRs that have been inactive for over 30 and will close them in another 15 days. > Additionally, the stale issues will be marked with `stale-issue` label and the stale PRs will be marked with `stale-pr` label. Issues/ PR having `awaiting-approval` or `work-in-progress` labels are excluded from this check. Greet new users Add a GitHub Workflow that automatically greets new users on their first PR/ Issue. **Does this PR introduce any user-facing change?** Yes. Users will see comments on PRs and Issues **How was this patch tested?** No test required This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486764) Remaining Estimate: 0h (was: 10m) Time Spent: 10m > Setup automated workflows > - > > Key: GRIFFIN-347 > URL: https://issues.apache.org/jira/browse/GRIFFIN-347 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Trivial > Original Estimate: 10m > Time Spent: 10m > Remaining Estimate: 0h > > *Check for Stale PRs/ Issues* > Add a GitHub Workflow that automatically checks for stale PRs and Issues and > tags/ closes them when inactive for a long duration. > {quote}This workflow will run every day at 00.00 UTC to check for any issues/ > PRs that have been inactive for over 30 and will close them in another 15 > days. > Additionally, the stale issues will be marked with `stale-issue` label and > the stale PRs will be marked with `stale-pr` label. Issues/ PR having > `awaiting-approval` or `work-in-progress` labels are excluded from this check. > {quote} > *Greet new users* > Add a GitHub Workflow that automatically greets new users on their first PR/ > Issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
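The stale-check policy quoted in the description above can be read as a small decision rule. The following is only a rough Scala sketch of that rule, not the workflow added in PR #583. It assumes the unit in "inactive for over 30" is days, and it assumes an item is closed once it has been inactive for more than 30 + 15 days. `StalePolicy`, `decide`, `staleLabel` and the `Action` values are hypothetical names used for illustration; only the label names come from the description.

```scala
import java.time.{Duration, Instant}

// Hypothetical outcome of one daily check for a single issue or PR.
sealed trait Action
case object Keep extends Action
case object MarkStale extends Action
case object Close extends Action

object StalePolicy {
  // Assumption: the "30" and "15" in the description are days.
  val daysBeforeStale = 30L
  val daysBeforeClose = 15L

  // Labels that exclude an issue/PR from the check (from the description).
  val exemptLabels: Set[String] = Set("awaiting-approval", "work-in-progress")

  // Label applied when an item is marked stale (from the description).
  def staleLabel(isPullRequest: Boolean): String =
    if (isPullRequest) "stale-pr" else "stale-issue"

  // Simplified rule: exempt items are kept, items inactive for more than
  // 30 + 15 days are closed, items inactive for more than 30 days are marked.
  def decide(lastActivity: Instant, labels: Set[String], now: Instant): Action = {
    val inactiveDays = Duration.between(lastActivity, now).toDays
    if (labels.exists(exemptLabels.contains)) Keep
    else if (inactiveDays > daysBeforeStale + daysBeforeClose) Close
    else if (inactiveDays > daysBeforeStale) MarkStale
    else Keep
  }
}
```

The sketch deliberately ignores details a real workflow has to handle, such as whether adding the stale label itself counts as activity.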
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=468374=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-468374 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 10/Aug/20 02:53 Start Date: 10/Aug/20 02:53 Worklog Time Spent: 10m Work Description: guoyuepeng commented on pull request #575: URL: https://github.com/apache/griffin/pull/575#issuecomment-671141520
@chitralverma The merge process is as follows:
- use Python 2.7
- run ./merge_pr.py
- Which pull request would you like to merge? (e.g. 34): 575
- select 575
- Proceed with merging pull request #575? (y/n): y
- Merge complete (local ref PR_TOOL_MERGE_PR_575_MASTER). Push to apache-git? (y/n): y
- Would you like to pick 1aa8995a into another branch? (y/n): n
- Would you like to update an associated JIRA? (y/n): y
- Enter comma-separated fix version(s) [0.6.0]:
You should have permission for this. Tell me if you encounter any problems. Thanks, William
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 468374) Time Spent: 2h 10m (was: 2h) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Fix For: 0.6.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=468373=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-468373 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 10/Aug/20 02:50 Start Date: 10/Aug/20 02:50 Worklog Time Spent: 10m Work Description: asfgit closed pull request #575: URL: https://github.com/apache/griffin/pull/575 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 468373) Time Spent: 2h (was: 1h 50m) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=467097=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467097 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 06/Aug/20 06:41 Start Date: 06/Aug/20 06:41 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #575: URL: https://github.com/apache/griffin/pull/575#issuecomment-669736481 absolutely, I'm all in for Griffin. :) @wankunde can you please merge this. Also, can you tell me how the requests are merged for this project so that I can help close some of the open PRs. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 467097) Time Spent: 1h 50m (was: 1h 40m) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=467088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467088 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 06/Aug/20 06:20 Start Date: 06/Aug/20 06:20 Worklog Time Spent: 10m Work Description: wankunde commented on pull request #575: URL: https://github.com/apache/griffin/pull/575#issuecomment-669727067 LGTM, @chitralverma Many thanks for your work. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 467088) Time Spent: 1h 40m (was: 1.5h) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-323) Refactor configuration for Data Source and Data Source Connector
[ https://issues.apache.org/jira/browse/GRIFFIN-323?focusedWorklogId=464011=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-464011 ] ASF GitHub Bot logged work on GRIFFIN-323: -- Author: ASF GitHub Bot Created on: 29/Jul/20 07:53 Start Date: 29/Jul/20 07:53 Worklog Time Spent: 10m Work Description: zgdong1987 commented on pull request #568: URL: https://github.com/apache/griffin/pull/568#issuecomment-665456611 I switched to version 0.5.0, but the UI failed to compile. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 464011) Time Spent: 1h 50m (was: 1h 40m) > Refactor configuration for Data Source and Data Source Connector > > > Key: GRIFFIN-323 > URL: https://issues.apache.org/jira/browse/GRIFFIN-323 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Fix For: 0.6.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Current config structure for Data Source is as follows, > {noformat} > > "data.sources": [ > { > "name": "src", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > }, > { > "name": "tgt", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > } > ] > {noformat} > > This ticket proposes the following changes, > * remove 'version' from 'DataConnectorParam' as it is not being used > anywhere in the codebase. > * change 'connectors' from array type to a single JSON object. Since a data > source named X may only be of one type (hive, file etc), the connector field > should not be an array. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
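The two changes proposed in the ticket (drop 'version', make the connector a single object rather than an array) can be pictured with a small Scala model. This is a minimal sketch under stated assumptions, not Griffin's actual code: the case classes below are simplified, hypothetical stand-ins that only borrow the names `DataSourceParam` and `DataConnectorParam` from the ticket and the log, and the field names `conType` and `connector` are illustrative. The assertion message is the one that appears in the error log quoted elsewhere in this thread.

```scala
// Hypothetical, simplified stand-in: no 'version' field, as the ticket proposes.
final case class DataConnectorParam(conType: String, config: Map[String, Any])

// Hypothetical, simplified stand-in: one optional connector instead of an array.
final case class DataSourceParam(name: String, connector: Option[DataConnectorParam]) {
  // Fails with java.lang.AssertionError when no connector is supplied.
  def validate(): Unit =
    assert(connector.isDefined, "Connector is undefined or invalid")
}

object ConnectorExample {
  // Mirrors the "src" entry from the ticket, with a single connector object.
  val src: DataSourceParam = DataSourceParam(
    "src",
    Some(DataConnectorParam("AVRO", Map("file.path" -> "/", "file.name" -> ".avro"))))

  // A data source still submitted without a single 'connector' object.
  val missing: DataSourceParam = DataSourceParam("tgt", None)
}
```

Under this model, a request that still sends the old 'connectors' array would leave `connector` empty, which is one plausible way to arrive at the "Connector is undefined or invalid" assertion reported by users of the UI and services.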
[jira] [Work logged] (GRIFFIN-323) Refactor configuration for Data Source and Data Source Connector
[ https://issues.apache.org/jira/browse/GRIFFIN-323?focusedWorklogId=459845=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459845 ] ASF GitHub Bot logged work on GRIFFIN-323: -- Author: ASF GitHub Bot Created on: 16/Jul/20 15:04 Start Date: 16/Jul/20 15:04 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #568: URL: https://github.com/apache/griffin/pull/568#issuecomment-659472645 Hi Everyone, Please note that these changes were made in the `measure` module only. For a more stable release, use version 0.5.0, available on Maven Central. Otherwise, PRs that fix the `services` and `ui` modules are most welcome. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 459845) Time Spent: 1.5h (was: 1h 20m) > Refactor configuration for Data Source and Data Source Connector > > > Key: GRIFFIN-323 > URL: https://issues.apache.org/jira/browse/GRIFFIN-323 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Fix For: 0.6.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Current config structure for Data Source is as follows, > {noformat} > > "data.sources": [ > { > "name": "src", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > }, > { > "name": "tgt", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > } > ] > {noformat} > > This ticket proposes the following changes, > * remove 'version' from 'DataConnectorParam' as it is not being used > anywhere in the codebase. > * change 'connectors' from array type to a single JSON object. 
Since a data > source named X may only be of one type (hive, file etc), the connector field > should not be an array. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-323) Refactor configuration for Data Source and Data Source Connector
[ https://issues.apache.org/jira/browse/GRIFFIN-323?focusedWorklogId=459832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459832 ] ASF GitHub Bot logged work on GRIFFIN-323: -- Author: ASF GitHub Bot Created on: 16/Jul/20 14:03 Start Date: 16/Jul/20 14:03 Worklog Time Spent: 10m Work Description: FaizanSh commented on pull request #568: URL: https://github.com/apache/griffin/pull/568#issuecomment-659433895 > Yes these changes are not supported by the current ui and services Hi, this problem still persists in the code. What workaround do you suggest? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 459832) Time Spent: 1h 20m (was: 1h 10m) > Refactor configuration for Data Source and Data Source Connector > > > Key: GRIFFIN-323 > URL: https://issues.apache.org/jira/browse/GRIFFIN-323 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Fix For: 0.6.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Current config structure for Data Source is as follows, > {noformat} > > "data.sources": [ > { > "name": "src", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > }, > { > "name": "tgt", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > } > ] > {noformat} > > This ticket proposes the following changes, > * remove 'version' from 'DataConnectorParam' as it is not being used > anywhere in the codebase. > * change 'connectors' from array type to a single JSON object. Since a data > source named X may only be of one type (hive, file etc), the connector field > should not be an array. 
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-323) Refactor configuration for Data Source and Data Source Connector
[ https://issues.apache.org/jira/browse/GRIFFIN-323?focusedWorklogId=452464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-452464 ] ASF GitHub Bot logged work on GRIFFIN-323: -- Author: ASF GitHub Bot Created on: 29/Jun/20 16:31 Start Date: 29/Jun/20 16:31 Worklog Time Spent: 10m Work Description: Rayleigh0727 commented on pull request #568: URL: https://github.com/apache/griffin/pull/568#issuecomment-651229496 > I do not see a change in the service code to accept this change in contract. When I submit with 'connectors' from the UI the request goes through but fails while executing the job (log snippet below) and with 'connector' does not work in the service API call since it's expecting 'connectors' to submit a measure. Am I missing something here? > `20/03/19 18:27:26 ERROR Application$: assertion failed: Connector is undefined or invalid java.lang.AssertionError: assertion failed: Connector is undefined or invalid at scala.Predef$.assert(Predef.scala:170) at org.apache.griffin.measure.configuration.dqdefinition.DataSourceParam.validate(DQConfig.scala:100)` Hi, I encountered the same problem. Did you solve it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 452464) Time Spent: 1h 10m (was: 1h) > Refactor configuration for Data Source and Data Source Connector > > > Key: GRIFFIN-323 > URL: https://issues.apache.org/jira/browse/GRIFFIN-323 > Project: Griffin > Issue Type: Improvement >Reporter: Chitral Verma >Assignee: Chitral Verma >Priority: Major > Fix For: 0.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Current config structure for Data Source is as follows, > {noformat} > > "data.sources": [ > { > "name": "src", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > }, > { > "name": "tgt", > "connectors": [ > { > "type": "AVRO", > "version": "1.7", > "config": { > "file.path": "/", > "file.name": ".avro" > } > } > ] > } > ] > {noformat} > > This ticket proposes the following changes, > * remove 'version' from 'DataConnectorParam' as it is not being used > anywhere in the codebase. > * change 'connectors' from array type to a single JSON object. Since a data > source named X may only be of one type (hive, file etc), the connector field > should not be an array. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=452026=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-452026 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 28/Jun/20 12:25 Start Date: 28/Jun/20 12:25 Worklog Time Spent: 10m Work Description: wankunde commented on pull request #575: URL: https://github.com/apache/griffin/pull/575#issuecomment-650745454 Hi, @chitralverma , could you provide an implementation example of the `open` and `close` methods? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 452026) Time Spent: 1h 10m (was: 1h) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=450431=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450431 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 24/Jun/20 13:17 Start Date: 24/Jun/20 13:17 Worklog Time Spent: 10m Work Description: chitralverma commented on pull request #575: URL: https://github.com/apache/griffin/pull/575#issuecomment-648813731 @wankunde the `open` and `close` methods are for future custom sink implementations, for example Redis, JDBC, etc., that do not rely on Spark DataSource v1/v2. Such data sources require one-time initialization of a connection or connection pool, which can then be serialized to all executors each time the write operation is called. This PR also acts as a basic cleanup for structured streaming sinks, which I'm working on. I'm also planning to rewrite HDFSSink as FileBasedSink, much like FileBasedDataConnector, and to include many other sinks. Griffin is going to get really exciting. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 450431) Time Spent: 1h (was: 50m) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
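The `open`/`close` lifecycle described in the comment above might look roughly like the Scala sketch below. It is an illustration under assumptions, not code from PR #575: `SimpleSink` is a cut-down stand-in for the `Sink` trait (no Spark or `Loggable` dependency), the signature `close(): Unit` is assumed because the thread only names the method, and `InMemorySink`, `write` and `isOpen` are hypothetical. A buffer stands in for the connection or connection pool that a Redis or JDBC sink would create once in `open`.

```scala
import scala.collection.mutable.ArrayBuffer

// Cut-down stand-in for the Sink trait; members mirror the quoted diff,
// except close(), whose signature is an assumption.
trait SimpleSink extends Serializable {
  val jobName: String
  val timeStamp: Long
  val config: Map[String, Any]
  val block: Boolean

  def validate(): Boolean
  def open(applicationId: String): Unit
  def close(): Unit
}

// Hypothetical sink whose "connection" is an in-memory buffer.
class InMemorySink(
    val jobName: String,
    val timeStamp: Long,
    val config: Map[String, Any] = Map.empty,
    val block: Boolean = false)
    extends SimpleSink {

  // One-time resource created in open() and released in close().
  @transient private var connection: Option[ArrayBuffer[String]] = None

  // Pre-requisite check performed before opening the sink.
  def validate(): Boolean = jobName.nonEmpty

  def open(applicationId: String): Unit =
    connection = Some(ArrayBuffer(s"opened by $applicationId"))

  // Hypothetical write operation; fails if open() was never called.
  def write(line: String): Unit = connection match {
    case Some(buffer) =>
      buffer += line
      ()
    case None =>
      throw new IllegalStateException("sink is not open")
  }

  def close(): Unit = connection = None

  def isOpen: Boolean = connection.isDefined
}
```

A caller would be expected to run `validate()`, then `open(...)`, perform its writes, and finally `close()`. How an opened connection reaches the executors is left out of this local sketch.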
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=450428=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450428 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 24/Jun/20 13:11 Start Date: 24/Jun/20 13:11 Worklog Time Spent: 10m Work Description: chitralverma commented on a change in pull request #575: URL: https://github.com/apache/griffin/pull/575#discussion_r444881499
## File path: measure/src/main/scala/org/apache/griffin/measure/sink/Sink.scala ##
@@ -18,30 +18,57 @@ package org.apache.griffin.measure.sink
 import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.DataFrame
 import org.apache.griffin.measure.Loggable
 /**
- * sink metric and record
+ * Base trait for batch and Streaming Sinks.
+ * To implement custom sinks, extend your classes with this trait.
  */
 trait Sink extends Loggable with Serializable {
-  val metricName: String
+
+  val jobName: String
Review comment: absolutely, I had the same in mind but I was planning to do it as part of a separate config refactoring in the near future. Do you suggest I do this right now or later?
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 450428) Time Spent: 50m (was: 40m) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=450426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450426 ] ASF GitHub Bot logged work on GRIFFIN-305: -- Author: ASF GitHub Bot Created on: 24/Jun/20 13:10 Start Date: 24/Jun/20 13:10 Worklog Time Spent: 10m Work Description: chitralverma commented on a change in pull request #575: URL: https://github.com/apache/griffin/pull/575#discussion_r444880634
## File path: measure/src/main/scala/org/apache/griffin/measure/sink/Sink.scala ##
@@ -18,30 +18,57 @@ package org.apache.griffin.measure.sink
 import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.DataFrame
 import org.apache.griffin.measure.Loggable
 /**
- * sink metric and record
+ * Base trait for batch and Streaming Sinks.
+ * To implement custom sinks, extend your classes with this trait.
  */
 trait Sink extends Loggable with Serializable {
-  val metricName: String
+
+  val jobName: String
   val timeStamp: Long
   val config: Map[String, Any]
   val block: Boolean
-  def available(): Boolean
+  /**
+   * Ensures that the pre-requisites (if any) of the Sink are met before opening it.
+   */
+  def validate(): Boolean
-  def start(msg: String): Unit
-  def finish(): Unit
+  /**
+   * Allows initialization of the connection to the sink (if required).
+   *
+   * @param applicationId Spark Application ID
+   */
+  def open(applicationId: String): Unit
Review comment: @wankunde this is as per the existing implementation. I just changed the variable names to remove ambiguity and made no functional change. This has been done in many other places as well. I'll refactor `applicationId` in favor of something more descriptive soon.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 450426) Time Spent: 40m (was: 0.5h) > Standardize Sink Hierarchy > -- > > Key: GRIFFIN-305 > URL: https://issues.apache.org/jira/browse/GRIFFIN-305 > Project: Griffin > Issue Type: Sub-task >Reporter: Chitral Verma >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=450388=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450388 ]

ASF GitHub Bot logged work on GRIFFIN-305:
--
Author: ASF GitHub Bot
Created on: 24/Jun/20 11:05
Start Date: 24/Jun/20 11:05
Worklog Time Spent: 10m
Work Description: wankunde commented on a change in pull request #575:
URL: https://github.com/apache/griffin/pull/575#discussion_r444802163

##
File path: measure/src/main/scala/org/apache/griffin/measure/sink/Sink.scala
##
@@ -18,30 +18,57 @@ package org.apache.griffin.measure.sink

 import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.DataFrame

 import org.apache.griffin.measure.Loggable

 /**
- * sink metric and record
+ * Base trait for batch and Streaming Sinks.
+ * To implement custom sinks, extend your classes with this trait.
  */
 trait Sink extends Loggable with Serializable {
-  val metricName: String
+
+  val jobName: String
   val timeStamp: Long
   val config: Map[String, Any]
   val block: Boolean

-  def available(): Boolean
+  /**
+   * Ensures that the pre-requisites (if any) of the Sink are met before opening it.
+   */
+  def validate(): Boolean

-  def start(msg: String): Unit
-  def finish(): Unit
+  /**
+   * Allows initialization of the connection to the sink (if required).
+   *
+   * @param applicationId Spark Application ID
+   */
+  def open(applicationId: String): Unit

Review comment:
What's the use of `applicationId`? Can we use `jobName` instead?

##
File path: measure/src/main/scala/org/apache/griffin/measure/sink/Sink.scala
##
@@ -18,30 +18,57 @@ package org.apache.griffin.measure.sink

 import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.DataFrame

 import org.apache.griffin.measure.Loggable

 /**
- * sink metric and record
+ * Base trait for batch and Streaming Sinks.
+ * To implement custom sinks, extend your classes with this trait.
  */
 trait Sink extends Loggable with Serializable {
-  val metricName: String
+
+  val jobName: String

Review comment:
It would be better to unify the variable names so that they are easier to understand: in `DQConfig` it is `name`, in `BatchDQApp` it is `metricName`, in `DQContext` it is `name`, and in `SinkFactory` it is `jobName`.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 450388)
Time Spent: 0.5h (was: 20m)

> Standardize Sink Hierarchy
> --
>
> Key: GRIFFIN-305
> URL: https://issues.apache.org/jira/browse/GRIFFIN-305
> Project: Griffin
> Issue Type: Sub-task
> Reporter: Chitral Verma
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)
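The lifecycle implied by the reviewed `Sink` trait (validate, then open, then close) can be sketched as follows. This is only an illustration under stated assumptions: `SketchSink` is a simplified stand-in that omits `Loggable` and all Spark types, `InMemorySink` and `SinkLifecycle` are hypothetical names, and `close` is taken from the pull request description rather than from the hunk quoted in the review. It also shows one possible reading of the review question: a Spark application ID differs per run, whereas a job name is typically reused across runs. The thread does not state whether that is the author's reason.

```scala
// Simplified stand-in for the Sink trait in the diff (assumption: Loggable
// and the Spark-specific members are left out so this compiles on its own).
trait SketchSink extends Serializable {
  val jobName: String
  val timeStamp: Long
  val config: Map[String, Any]
  val block: Boolean

  def validate(): Boolean
  def open(applicationId: String): Unit
  def close(): Unit
}

// Hypothetical sink that only records its lifecycle events in memory.
class InMemorySink(
    val jobName: String,
    val timeStamp: Long,
    val config: Map[String, Any] = Map.empty,
    val block: Boolean = false)
    extends SketchSink {

  private val buffer = scala.collection.mutable.ArrayBuffer.empty[String]

  def events: Seq[String] = buffer.toList

  // A sink without a job name is treated as not ready to be opened.
  def validate(): Boolean = jobName.nonEmpty

  def open(applicationId: String): Unit =
    buffer += s"open:$jobName:$applicationId"

  def close(): Unit = buffer += s"close:$jobName"
}

object SinkLifecycle {
  // Opens the sink only if validation passes and always closes it afterwards.
  def run(sink: SketchSink, applicationId: String): Boolean =
    if (sink.validate()) {
      sink.open(applicationId)
      try true
      finally sink.close()
    } else false
}
```

Calling `SinkLifecycle.run(new InMemorySink("accu_batch", 0L), "app-1")` opens and closes the sink once; a sink with an empty job name is skipped without being opened.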
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=447980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447980 ]

ASF GitHub Bot logged work on GRIFFIN-305:
--
Author: ASF GitHub Bot
Created on: 18/Jun/20 18:48
Start Date: 18/Jun/20 18:48
Worklog Time Spent: 10m
Work Description: chitralverma commented on pull request #575:
URL: https://github.com/apache/griffin/pull/575#issuecomment-646243010

@guoyuepeng @wankunde Can you please review this? Thanks.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 447980)
Time Spent: 20m (was: 10m)

> Standardize Sink Hierarchy
> --
>
> Key: GRIFFIN-305
> URL: https://issues.apache.org/jira/browse/GRIFFIN-305
> Project: Griffin
> Issue Type: Sub-task
> Reporter: Chitral Verma
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy
[ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=447342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447342 ]

ASF GitHub Bot logged work on GRIFFIN-305:
--
Author: ASF GitHub Bot
Created on: 17/Jun/20 15:06
Start Date: 17/Jun/20 15:06
Worklog Time Spent: 10m
Work Description: chitralverma opened a new pull request #575:
URL: https://github.com/apache/griffin/pull/575

**What changes were proposed in this pull request?**

The current implementation of `Sinks` in Griffin has the issues listed below, which this PR aims to fix.

- `Sinks` are based on the recursive `MultiSink` class, which is itself a sink but is implemented on top of a `Seq`. This is ambiguous and not very useful, so it has been removed.
- Some unused code, such as `SinkContext`, has been removed.
- Data is converted from the performant DataFrame to an RDD while persisting in both streaming and batch pipelines. A new method, `sinkBatchRecords`, has been added to allow operations directly on the DataFrame for batch pipelines. Streaming will still use the old implementation, which will be replaced with structured streaming.
- Refactored the methods of `Sink`: `start`/`finish` were changed to `open`/`close`, and the field name was corrected, since `jobName` was incorrectly passed as `metricName`.
- Presently, only one instance of a sink of a given type can be defined in the env config. This rules out cases where you want to configure multiple sinks of the same type, such as HDFS or JDBC. Added a sink `name` to the env config, which is also used in the job config to select the sink to use.
- Updated all sinks as per the changes above, with some additional changes to `ConsoleSink`.

**Does this PR introduce any user-facing change?**

Yes. As mentioned above, the sink config has changed in the env and job configs.

**How was this patch tested?**

Griffin test suite and additional unit test cases.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 447342)
Remaining Estimate: 0h
Time Spent: 10m

> Standardize Sink Hierarchy
> --
>
> Key: GRIFFIN-305
> URL: https://issues.apache.org/jira/browse/GRIFFIN-305
> Project: Griffin
> Issue Type: Sub-task
> Reporter: Chitral Verma
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)
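The named-sink change described in pull request #575 (several sinks of the same type in the env config, selected by `name` from the job config) might be resolved along these lines. This is a sketch, not Griffin's actual code: `SinkParam`, its fields, and `SinkResolver` are hypothetical names chosen for illustration, and the real config classes may look different.

```scala
// Hypothetical model of one sink entry in the env config.
final case class SinkParam(
    name: String,
    sinkType: String,
    config: Map[String, Any] = Map.empty)

object SinkResolver {
  // Looks up the sink names requested by a job among the env definitions.
  // Returns the matching definitions in job order, or the unknown names.
  def resolve(
      envSinks: Seq[SinkParam],
      jobSinkNames: Seq[String]): Either[Seq[String], Seq[SinkParam]] = {
    val byName = envSinks.map(s => s.name -> s).toMap
    val (found, missing) = jobSinkNames.partition(byName.contains)
    if (missing.isEmpty) Right(found.map(byName)) else Left(missing)
  }
}
```

With two entries of type `HDFS` named, say, `hdfsRaw` and `hdfsReports`, a job that lists only `hdfsReports` gets exactly that definition back, which is the case the PR description says was not possible when sinks were keyed by type alone.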
[jira] [Work logged] (GRIFFIN-326) New implementation for Elasticsearch Data Connector (Batch)
[ https://issues.apache.org/jira/browse/GRIFFIN-326?focusedWorklogId=441228=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-441228 ]

ASF GitHub Bot logged work on GRIFFIN-326:
--
Author: ASF GitHub Bot
Created on: 04/Jun/20 11:09
Start Date: 04/Jun/20 11:09
Worklog Time Spent: 10m
Work Description: chitralverma commented on pull request #569:
URL: https://github.com/apache/griffin/pull/569#issuecomment-638782290

@guoyuepeng It seems a build has been running for this for a month now. Can you check this?
https://travis-ci.org/github/apache/griffin/builds/694599100?utm_source=github_status_medium=notification

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 441228)
Time Spent: 4h 20m (was: 4h 10m)

> New implementation for Elasticsearch Data Connector (Batch)
> ---
>
> Key: GRIFFIN-326
> URL: https://issues.apache.org/jira/browse/GRIFFIN-326
> Project: Griffin
> Issue Type: Sub-task
> Reporter: Chitral Verma
> Priority: Major
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> The current implementation of Elasticsearch relies on sending POST requests from the driver using either SQL or search mode for query filtering.
> This implementation has the following potential issues:
> * Data is fetched for indexes (database scopes in ES) in bulk via a single call on the driver. If the index has a lot of data, the large response payload creates a bottleneck on the driver.
> * Further, the driver then needs to parse this response payload and parallelize it. This is again a driver-side bottleneck, as each JSON record needs to be mapped to a set schema in a type-safe manner.
> * Only _host_, _port_ and _version_ are available as options to configure the connection to the ES node or cluster.
> * Source partitioning logic is not carried forward when parallelizing records; the records will be randomized due to Spark's default partitioning.
> * Even though this implementation is a first-class member of Apache Griffin, it is based on the _custom_ connector trait.
> The proposed implementation aims to:
> * Deprecate the current implementation in favor of the official [elasticsearch-hadoop|[https://github.com/elastic/elasticsearch-hadoop/tree/master/spark/sql-20]] library.
> * This library is built on the DataSource API for Spark 2.2.x+ and thus brings support for filter pushdowns, column pruning, unified read and write, and additional optimizations.
> * Many configuration options are available for ES connectivity, [check here|[https://github.com/elastic/elasticsearch-hadoop/blob/master/mr/src/main/java/org/elasticsearch/hadoop/cfg/ConfigurationOptions.java]]
> * Any filters can be applied as expressions directly on the data frame and are pushed automatically to the source.
> The new implementation will look something like,
> {code:java}
> sparkSession.read.format("es").options( ??? ).load(""){code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
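The `options( ??? )` placeholder in the GRIFFIN-326 description could be filled from a small connector config, roughly as sketched here. `es.nodes` and `es.port` are elasticsearch-hadoop setting names; `EsConnectorConfig`, its fields, and `EsOptions` are hypothetical and are not taken from the Griffin code base or from pull request #569.

```scala
// Hypothetical connector settings; the field names are illustrative only.
final case class EsConnectorConfig(
    nodes: Seq[String],
    port: Int = 9200,
    extra: Map[String, String] = Map.empty)

object EsOptions {
  // Builds the option map handed to the "es" data source.
  // Entries in `extra` are passed through unchanged and, because they are
  // appended last, override the generated "es.nodes" / "es.port" values.
  def build(cfg: EsConnectorConfig): Map[String, String] = {
    require(cfg.nodes.nonEmpty, "at least one Elasticsearch node is required")
    Map(
      "es.nodes" -> cfg.nodes.mkString(","),
      "es.port" -> cfg.port.toString
    ) ++ cfg.extra
  }
}
```

The resulting map would then be passed in place of the `???`, in the form `sparkSession.read.format("es").options(opts).load(index)`, where `index` names the Elasticsearch resource to read. That step needs Spark and the elasticsearch-hadoop library on the classpath, so it is not part of the sketch.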
[jira] [Work logged] (GRIFFIN-326) New implementation for Elasticsearch Data Connector (Batch)
[ https://issues.apache.org/jira/browse/GRIFFIN-326?focusedWorklogId=441227=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-441227 ]

ASF GitHub Bot logged work on GRIFFIN-326:
--
Author: ASF GitHub Bot
Created on: 04/Jun/20 11:01
Start Date: 04/Jun/20 11:01
Worklog Time Spent: 10m
Work Description: asfgit closed pull request #569:
URL: https://github.com/apache/griffin/pull/569

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 441227)
Time Spent: 4h 10m (was: 4h)

> New implementation for Elasticsearch Data Connector (Batch)
> ---
>
> Key: GRIFFIN-326
> URL: https://issues.apache.org/jira/browse/GRIFFIN-326
> Project: Griffin
> Issue Type: Sub-task
> Reporter: Chitral Verma
> Priority: Major
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)