[ANNOUNCE] Apache Griffin-0.3.0-incubating released

2018-09-06 Thread Lionel Liu
Hi all,

The Apache Griffin (incubating) team is pleased to announce the release of
Griffin 0.3.0-incubating.

Apache Griffin is a data quality solution for modern data systems.
It defines a standard process to define and measure data quality for
well-known dimensions.

The release is available at:
https://www.apache.org/dyn/closer.cgi/incubator/griffin

Thanks,

The Apache Griffin (incubating) team

=
*DISCLAIMER*
Apache Griffin is an effort undergoing incubation at The Apache Software
Foundation (ASF), sponsored by Incubator.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other successful
ASF projects.
While incubation status is not necessarily a reflection of the completeness
or stability of the code, it does indicate that the project has yet to be
fully endorsed by the ASF.


Re: [VOTE] Release of Apache Griffin-0.3.0-incubating [RC1]

2018-09-06 Thread William Guo
PMC, please verify this release for us. Thanks.

Copying the vote from Henry (Griffin mentor) on the dev list here:
==
+1 (binding)

LICENSE file looks good
NOTICE file looks good
DISCLAIMER file looks good
NO exes in source artifacts
License header exists
Source compiles.




On Wed, Sep 5, 2018 at 3:02 PM Henry Saputra wrote:

> +1 (binding)
>
> LICENSE file looks good
> NOTICE file looks good
> DISCLAIMER file looks good
> NO exes in source artifacts
> License header exists
> Source compiles.
>
> - Henry
>
> On Thu, Aug 30, 2018 at 7:44 AM Lionel Liu wrote:
>
> > Hi all,
> >
> > This is a call for a vote on releasing Apache Griffin 0.3.0-incubating,
> > release candidate 1.
> > Apache Griffin is a data quality service for modern data systems. It
> > defines a standard process to define and measure data quality for
> > well-known dimensions.
> > With Apache Griffin, users will be able to quickly define their data
> > quality requirements and then get the results in near real time in a
> > systematic way.
> >
> >
> > ** Highlights **
> > * Refactor measure module for better abstraction.
> > * Support missing records download for accuracy measurement.
> > * Support regular expression detection count in profiling measurement.
> > * Fix several bugs on UI.
> >
> >
> > The source tarball, including signatures, digests, etc., can be found
> > at:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/0.3.0-incubating/
> >
> > The tag to be voted upon is 0.3.0-incubating:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-griffin.git;a=shortlog;h=refs/tags/griffin-0.3.0-incubating
> >
> > The release hash is:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-griffin.git;a=commit;h=797cc62c94449e485d3af910bc8557ca9841bb22
> >
> > The Nexus Staging URL:
> > https://repository.apache.org/content/repositories/orgapachegriffin-1018
> >
> >
> > Release artifacts are signed with the following key:
> > 7F00C3BA90F3ECAEECB843A79BD6EC6C02379561
> > KEYS file available:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/KEYS
> >
> > For information about the contents of this release, see:
> > https://dist.apache.org/repos/dist/dev/incubator/griffin/0.3.0-incubating/CHANGES.txt
> >
> >
> > Please vote on releasing this package as Apache Griffin
> > 0.3.0-incubating
> >
> >
> > The vote will be open for 72 hours.
> >
> > [ ] +1 Release this package as Apache Griffin 0.3.0-incubating
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because ...
> >
> >
> > You can follow the steps here to verify the release before you vote:
> > https://cwiki.apache.org/confluence/display/GRIFFIN/How+to+Verify+Release+Package
> >
> >
> > Thanks,
> > Lionel
> > On behalf of Apache Griffin PPMC
> >
>
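
The vote mail above points to signatures and digests next to the source
tarball. As a rough illustration of the digest part of such a check, here is a
minimal Python sketch. It assumes the published digest is SHA-512 and that the
caller passes only the hex string; the thread does not say which algorithm the
release uses, and the function names are made up for this example. Signature
verification against the KEYS file is a separate step and is not covered here.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(artifact_path, published_digest):
    """Compare a file's SHA-512 with a published hex value.

    Whitespace and letter case in the published value are ignored,
    because digest files are often wrapped or upper-cased.
    """
    expected = "".join(published_digest.split()).lower()
    return sha512_of(artifact_path) == expected
```

A reviewer would call digest_matches() with the path of the downloaded tarball
and the hex value copied from the digest file, and treat a False result as a
reason to vote -1 until the mismatch is explained.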


[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-09-06 Thread Lionel Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606643#comment-16606643
 ] 

Lionel Liu commented on GRIFFIN-190:


Hi [~mkisly], 

You wrote: "When using something like 
sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get an error 
wrong fs expected file:///".

Where did you get this error: in the Griffin service log or in the Livy log? I 
suppose it is in the Livy log. In the Griffin service, we do NOT parse the value 
of "sparkJob.file" as a path; we send the string value directly to Livy as the 
value of the "file" field, like this: "file": 
"hdfs://localhost:9000/griffin/griffin-measure.jar".
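
To make the pass-through concrete, here is a minimal Python sketch of a batch 
submission to Livy's POST /batches endpoint. It is not Griffin's own code (the 
Griffin service is written in Java); the function names and the optional 
class_name and args parameters are assumptions for illustration only. The point 
it shows is that the "file" string reaches Livy unchanged, so any filesystem 
resolution happens on the Livy/Spark side.

```python
import json
import urllib.request


def build_batch_request(file_uri, class_name=None, args=None):
    """Build the JSON body for a Livy batch; the file URI is passed through untouched."""
    body = {"file": file_uri}
    if class_name:
        body["className"] = class_name
    if args:
        body["args"] = list(args)
    return body


def submit_batch(livy_url, body):
    """POST the body to Livy's /batches endpoint and return the decoded JSON reply."""
    request = urllib.request.Request(
        livy_url.rstrip("/") + "/batches",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the payload; no network call is made here.
    payload = build_batch_request("hdfs://localhost:9000/griffin/griffin-measure.jar")
    print(json.dumps(payload))
```

Calling submit_batch() with the base URL of a running Livy server and this 
payload would reproduce the request outside Griffin, which can help separate a 
Griffin problem from a Livy or Spark configuration problem.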

In application.properties, "fs.defaultFS" is only used to check whether the done 
file exists; it does not affect Spark job submission.

I guess there might be an issue with the environment. I'm not sure how your 
Livy and Spark are configured; maybe you can refer to the scripts we use to 
build our Docker image:

[https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/spark]

[https://github.com/bhlx3lyx7/griffin-docker/tree/master/env2/conf/livy]

Or the error might be caused by other parameters such as "sparkJob.jars" or 
"spark.yarn.dist.files"; they also matter if you need to enable Hive Context 
when submitting Spark jobs.

> Blank Health and DQ Metrics Screen
> --
>
> Key: GRIFFIN-190
> URL: https://issues.apache.org/jira/browse/GRIFFIN-190
> Project: Griffin (Incubating)
> Issue Type: Bug
> Affects Versions: 0.2.0-incubating
> Reporter: Cory Woytasik
> Priority: Major
>
> Griffin is up and running.  We have both an accuracy measure and a profiling 
> measure that is set to run every minute via jobs.  When we click the chart 
> icon next to the job we receive a "no content" message.  When we click on the 
> Health link or DQ Metrics link they think for a second and then display a 
> blank screen.  We are thinking this might be ES related, but aren't 
> completely sure.  Need some help.  We assume it's a path or property setup 
> issue.  Here are the versions we are running:
> Hive - 3.1.0
> Elasticsearch - 5.3.1
> griffin - 0.2.0
> hadoop - 3.1.1
> livy - 0.3.0
> spark - 2.3.1
> Using postgres too



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GRIFFIN-190) Blank Health and DQ Metrics Screen

2018-09-06 Thread Michael Kisly (JIRA)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606483#comment-16606483
 ] 

Michael Kisly commented on GRIFFIN-190:
---

Hi Lionel,

The job appears to be failing somewhere in the interaction between Livy/Spark 
and HDFS. When we send a Postman request to Livy on port 8998 with something 
like "file": "hdfs://localhost:9000/example/", it seems to resolve the path 
fine. When running the Griffin job, Livy reports the following error in its 
logs: ERROR SparkProcApp: spark-submit exited with code 1. I've found some 
different behavior when editing the paths to the jar files in 
sparkJob.properties, but I am unsure how the address should be specified. When 
using something like 
sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar we get an error 
"wrong fs expected file:///", so we changed the paths of the jars to 
hdfs:///griffin/ and hdfs:///livy/; however, now we just get that exit code 1 
error. Also, in application.properties we have 
fs.defaultFS = hdfs://localhost:9000 specified. No applications appear to get 
created in YARN.
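
The two path forms tried above differ in whether they name a NameNode. A 
"wrong fs ... expected file:///" message typically suggests that whichever 
component checked the path was using the local filesystem as its default, 
though the thread does not confirm where the check happened. The Python sketch 
below is not Hadoop code; describe_fs_uri and its default_fs parameter are 
hypothetical names used only to show how each form splits into scheme and 
authority, and when an empty authority could be filled in from a default 
filesystem with the same scheme.

```python
from urllib.parse import urlsplit


def describe_fs_uri(uri, default_fs="file:///"):
    """Split a Hadoop-style path into scheme, authority and path.

    Missing pieces are taken from default_fs, which stands in for the
    fs.defaultFS setting seen by the component that resolves the path.
    """
    parts = urlsplit(uri)
    default = urlsplit(default_fs)
    scheme = parts.scheme or default.scheme
    if parts.netloc:
        # The URI names its own host and port, e.g. localhost:9000.
        authority = parts.netloc
    elif scheme == default.scheme:
        # Empty authority (hdfs:///griffin/) borrows the default's host and port.
        authority = default.netloc
    else:
        # Schemes differ, so the default cannot supply a host.
        authority = ""
    return {"scheme": scheme, "authority": authority, "path": parts.path}


if __name__ == "__main__":
    for candidate in ("hdfs://localhost:9000/griffin/griffin-measure.jar", "hdfs:///griffin/"):
        print(candidate, describe_fs_uri(candidate, default_fs="hdfs://localhost:9000"))
```

With a local default ("file:///"), hdfs:///griffin/ ends up with no authority 
at all, which is one reason to check which configuration Livy and Spark 
actually load.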

Thanks,

Mike   



