[ANNOUNCE] Apache Griffin-0.3.0-incubating released
Hi all,

The Apache Griffin (incubating) team is pleased to announce the release of Griffin 0.3.0-incubating.

Apache Griffin is a data quality solution for modern data systems. It defines a standard process to define and measure data quality for well-known dimensions.

The release is available at:
https://www.apache.org/dyn/closer.cgi/incubator/griffin

Thanks,
The Apache Griffin (incubating) team

*DISCLAIMER*
Apache Griffin is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
Re: [VOTE] Release of Apache Griffin-0.3.0-incubating [RC1]
PMC,

Please verify for us. Thanks.

Copying the vote from Henry (Griffin mentor) here from the dev list:

+1 (binding)

LICENSE file looks good
NOTICE file looks good
DISCLAIMER file looks good
NO exes in source artifacts
License header exists
Source compile.

On Wed, Sep 5, 2018 at 3:02 PM Henry Saputra wrote:

> +1 (binding)
>
> LICENSE file looks good
> NOTICE file looks good
> DISCLAIMER file looks good
> NO exes in source artifacts
> License header exists
> Source compile.
>
> - Henry
>
> On Thu, Aug 30, 2018 at 7:44 AM Lionel Liu wrote:
>
> > Hi all,
> >
> > This is a call for a vote on releasing Apache Griffin 0.3.0-incubating,
> > release candidate 1.
> > Apache Griffin is data quality service for modern data system, it
> > defines a standard process to define, measure data quality for well-known
> > dimensions.
> > With Apache Griffin, users will be able to quickly define their data
> > quality requirements and then get the result in near real time in
> > systematical approach.
> >
> > ** Highlights **
> > * Refactor measure module for better abstraction.
> > * Support missing records download for accuracy measurement.
> > * Support regular expression detection count in profiling measurement.
> > * Fix several bugs on UI.
> >
> > The source tarball, including signatures, digests, etc. can be found at:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/0.3.0-incubating/
> >
> > The tag to be voted upon is 0.3.0-incubating:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-griffin.git;a=shortlog;h=refs/tags/griffin-0.3.0-incubating
> >
> > The release hash is:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-griffin.git;a=commit;h=797cc62c94449e485d3af910bc8557ca9841bb22
> >
> > The Nexus Staging URL:
> > https://repository.apache.org/content/repositories/orgapachegriffin-1018
> >
> > Release artifacts are signed with the following key:
> > 7F00C3BA90F3ECAEECB843A79BD6EC6C02379561
> >
> > KEYS file available:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/KEYS
> >
> > For information about the contents of this release, see:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/0.3.0-incubating/CHANGES.txt
> >
> > Please vote on releasing this package as Apache Griffin 0.3.0-incubating.
> >
> > The vote will be open for 72 hours.
> >
> > [ ] +1 Release this package as Apache Griffin 0.3.0-incubating
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because ...
> >
> > You can follow the steps here to verify the release before you vote:
> > https://cwiki.apache.org/confluence/display/GRIFFIN/How+to+Verify+Release+Package
> >
> > Thanks,
> > Lionel
> > On behalf of Apache Griffin PPMC
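For readers checking the tarball by hand: the vote mail says the source tarball ships with digests. The sketch below covers only that digest comparison. It is not the project's verification procedure (that is on the wiki page linked in the vote mail) and it does not check the GPG signature. It assumes the published digest is SHA-512 and that you have already copied the hex value out of the digest file; the function names are made up for this example.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare the file's digest with a published hex value.

    Whitespace and letter case in `expected_hex` are ignored, because
    published digests are often upper-case or split into groups.
    """
    expected = "".join(expected_hex.split()).lower()
    return sha512_of(path) == expected
```

A call would look like `digest_matches("release.tar.gz", published_value)`, where both the file name and `published_value` are placeholders for whatever you downloaded.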
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606643#comment-16606643 ]

Lionel Liu commented on GRIFFIN-190:
------------------------------------

Hi [~mkisly],

> When using something sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get an error wrong fs expected [file:///] .

Where did you get this error? In the Griffin service log or in the Livy log?

I suppose it would be in the Livy log. In the Griffin service we do NOT parse the value of "sparkJob.file" as a path; we send the string value directly to Livy as the value of the "file" field, like this: "file": "hdfs://localhost:9000/griffin/griffin-measure.jar".

In application.properties, "fs.defaultFS" is only used to check whether the done file exists; it does not affect the Spark job submission.

I guess there might be an issue with the environment. I'm not sure how your Livy and Spark are configured; maybe you can refer to our Docker image build scripts:
[https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/spark]
[https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/livy]

Or the error might be caused by other parameters such as "sparkJob.jars" or "spark.yarn.dist.files"; they also matter if you need to enable Hive Context when submitting Spark jobs.

> Blank Health and DQ Metrics Screen
> ----------------------------------
>
>                 Key: GRIFFIN-190
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-190
>             Project: Griffin (Incubating)
>          Issue Type: Bug
>    Affects Versions: 0.2.0-incubating
>            Reporter: Cory Woytasik
>            Priority: Major
>
> Griffin is up and running. We have both an accuracy measure and a profiling
> measure that is set to run every minute via jobs. When we click the chart
> icon next to the job we receive a "no content" message. When we click on the
> Health link or DQ Metrics link they think for a second and then display a
> blank screen. We are thinking this might be ES related, but aren't
> completely sure. Need some help. We assume it's a path or property setup
> issue. Here are the versions we are running:
> Hive - 3.1.0
> Elasticsearch - 5.3.1
> griffin - 0.2.0
> hadoop - 3.1.1
> livy - 0.3.0
> spark - 2.3.1
> Using postgres too

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
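To make the point about the "file" field concrete, here is a rough sketch of a Livy batch submission in which the jar location is passed through as a plain string, as described in the comment above. This is not Griffin's code. It assumes Livy's `POST /batches` endpoint with a JSON body containing `file`, `className`, `args` and `jars`; the helper name, the class name `com.example.Main` and the Livy address are placeholders for illustration only.

```python
import json
import urllib.request


def build_batch_request(livy_url, jar_uri, class_name, args=(), jars=()):
    """Build (but do not send) a POST /batches request for Livy.

    `jar_uri` is copied into the "file" field unchanged: no path parsing
    and no filesystem lookup happens on the client side.
    """
    payload = {
        "file": jar_uri,
        "className": class_name,
        "args": list(args),
    }
    if jars:
        # Extra jars are also plain strings that Livy hands on to Spark.
        payload["jars"] = list(jars)
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: the jar URI is the one quoted in this thread,
# the class name and the Livy host are placeholders.
request = build_batch_request(
    "http://localhost:8998",
    "hdfs://localhost:9000/griffin/griffin-measure.jar",
    "com.example.Main",
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running Livy server. Because the client never interprets the URI, an error such as "Wrong FS" would have to come from whatever resolves the string later (Livy or Spark), which is why the Livy and Spark configuration is the first place to look.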
[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen
[ https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606483#comment-16606483 ]

Michael Kisly commented on GRIFFIN-190:
---------------------------------------

Hi Lionel,

The job appears to be failing somewhere in the interaction between Livy/Spark and HDFS. When we send a Postman request to Livy on port 8998 with something like "file": "hdfs://localhost:9000/example/", it seems to resolve the path fine. When running the Griffin job, Livy reports the following error in its logs:

ERROR SparkProcApp: spark-submit exited with code 1

I've found some different behavior when editing the path to the jar files in sparkJob.properties, but I am unsure how the address should be specified. When using something like sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get the error "wrong fs expected file:///", so we then changed the path of the jars to hdfs:///griffin/ and hdfs:///livy/. Now, however, we just get that exit code 1 error.

Also, in application.properties we have fs.defaultFS = hdfs://localhost:9000 specified.

No applications appear to get created in YARN.

Thanks,
Mike
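The two address forms tried above differ in one respect that is easy to miss: `hdfs://localhost:9000/...` names the NameNode explicitly, while `hdfs:///...` has an empty authority and so depends on whichever default filesystem the resolving process has configured. The sketch below only splits the URIs to show that difference; it does not talk to Hadoop. The statement about the default filesystem reflects my understanding of Hadoop's behaviour rather than anything confirmed in this thread, and the helper name is made up for this example.

```python
from urllib.parse import urlsplit


def describe_fs_uri(uri):
    """Split a filesystem URI into the parts that matter for resolution."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        # True only when the URI itself says which host and port to contact.
        "has_explicit_authority": bool(parts.netloc),
    }


explicit = describe_fs_uri("hdfs://localhost:9000/griffin/griffin-measure.jar")
implicit = describe_fs_uri("hdfs:///griffin/griffin-measure.jar")
```

Here `explicit["authority"]` is `"localhost:9000"` and `implicit["authority"]` is the empty string. If the process that resolves the second form is not picking up the cluster's `fs.defaultFS` (for example because the Hadoop configuration directory is not visible to it), the lookup may fall back to the local filesystem, which would be consistent with an "expected file:///" message.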