[jira] [Commented] (GRIFFIN-371) Setup Griffin next architecture milestones and tasks.

2022-08-10 Thread Eugene Liu (Jira)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577969#comment-17577969
 ] 

Eugene Liu commented on GRIFFIN-371:


Let's sum up the requirements from all parties and break the tasks down step 
by step based on the new architecture.

> Setup Griffin next architecture milestones and tasks.
> -
>
> Key: GRIFFIN-371
> URL: https://issues.apache.org/jira/browse/GRIFFIN-371
> Project: Griffin
> Issue Type: Bug
> Reporter: William Guo
> Assignee: William Guo
> Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Griffin Start is failed with eclipseLinkJpaConfig bean creating failed ,what should I do ?

2020-11-19 Thread Eugene Liu
Zhengqiang,

Note that the root cause is a missing library:
`java.lang.IllegalStateException: Cannot load driver class: com.mysql.jdbc.Driver`

Did you install MySQL in the test environment, and does the classpath include 
the MySQL JDBC library?
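
To check the classpath question, here is a minimal Python sketch (not part of Griffin) that lists which jars in a directory contain the class named in the error. The function name `jars_containing` is made up for illustration, and it only looks at the top-level entries of each jar, not at jars nested inside other jars.

```python
import zipfile
from pathlib import Path

# The class from the error message, written as a path inside a jar.
DRIVER_ENTRY = "com/mysql/jdbc/Driver.class"


def jars_containing(lib_dir, entry=DRIVER_ENTRY):
    """Return the *.jar files directly under lib_dir that contain `entry`."""
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Not a valid archive; skip it rather than abort the scan.
            continue
    return hits
```

For example, `jars_containing("/home/zmbigdata/program/griffin-service/lib")` uses the lib directory from the log. An empty result would suggest that no jar in that directory provides the driver.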

Thx
Eugene


From: 谭正强 
Sent: Thursday, November 19, 2020 3:29 PM
To: dev 
Subject: Griffin Start is failed with eclipseLinkJpaConfig bean creating failed 
,what should I do ?

Dear all,
I am installing Griffin by following the griffin-0.6.0 documentation. All the 
configuration is done, and I compiled it with the following command:
mvn -Dmaven.test.skip=true clean install


At the step where I unpack service-0.6.0.tar.gz and run bin/start.sh, I get 
the error below. What should I do? Any help would be appreciated, thanks.


2020-11-19 15:13:16.023 ERROR 30139 --- [main] o.s.b.SpringApplication [822] : Application run failed
org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'eclipseLinkJpaConfig' defined in URL [jar:file:/home/zmbigdata/program/griffin-service/lib/service-0.6.0.jar!/org/apache/griffin/core/config/EclipseLinkJpaConfig.class]: Unsatisfied dependency expressed through constructor parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dataSource' defined in class path resource [org/springframework/boot/autoconfigure/jdbc/DataSourceConfiguration$Hikari.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.zaxxer.hikari.HikariDataSource]: Factory method 'dataSource' threw exception; nested exception is java.lang.IllegalStateException: Cannot load driver class: com.mysql.jdbc.Driver
 at org.springframework.beans.factory.support.ConstructorResolver.createArgumentArray(ConstructorResolver.java:769) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:218) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1341) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1187) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:555) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:515) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:320) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:318) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:392) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1321) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1160) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:555) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:515) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:320) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:318) ~[spring-beans-5.1.9.RELEASE.jar:5.1.9.RELEASE]
 at 

Re: [VOTE] Release of Apache Griffin-0.6.0

2020-11-10 Thread Eugene Liu
+1

Let's move forward.

Thx

From: Kevin Yao 
Sent: Tuesday, November 10, 2020 3:24 PM
To: dev@griffin.apache.org 
Subject: Re: [VOTE] Release of Apache Griffin-0.6.0

+1

On Tue, Nov 10, 2020 at 1:40 PM 万昆  wrote:

> +1
>
>
>
>
> At 2020-11-09 15:02:26, "William Guo"  wrote:
> >Hi all,
> >
> >This is a call for a vote on releasing Apache Griffin 0.6.0, release
> >candidate 1.
> >Apache Griffin is a data quality service for modern data systems. It
> >defines a standard process to define and
> >measure data quality for well-known dimensions. With Apache Griffin,
> >users will be able to quickly define their data quality requirements and
> >then get the results in near real time in a systematic way.
> >
> >The source tarball, including signatures, digests, etc. can be found
> at:
> >https://dist.apache.org/repos/dist/release/griffin/0.6.0/
> >The tag to be voted upon is 0.6.0:
> >
> >
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/0.6.0
> >The release hash is :
> >
> >
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=commit;h=469d47d589ba3e5e9257c42f9106388f529ba02c
> >The Nexus Staging URL:
> >
> >https://repository.apache.org/content/repositories/orgapachegriffin-1023/
> >Release artifacts are signed with the following key:
> >753AD8D8DF507D7232A9BDBD9B403B9B1BFBCC23
> >KEYS file available:
> >https://dist.apache.org/repos/dist/release/griffin/KEYS
> >For information about the contents of this release, see:
> >https://dist.apache.org/repos/dist/release/griffin/0.6.0/CHANGES.txt
> >
> >Please vote on releasing this package as Apache Griffin 0.6.0
> >The vote will be open for 72 hours.
> >
> >[ ] +1 Release this package as Apache Griffin 0.6.0
> >[ ] +0 no opinion
> >[ ] -1 Do not release this package because ...
> >
> >Thanks,
> >William
> >On behalf of Apache Griffin PMC
>


Re: [DISCUSS] Build release for 0.6.0

2020-10-26 Thread Eugene Liu
I can't wait any longer for the new 0.6.0 release!

I agree with moving forward.

Thx
Eugene

From: William Guo 
Sent: Monday, October 26, 2020 11:27 AM
To: dev@griffin.apache.org 
Subject: [DISCUSS] Build release for 0.6.0

hi all,

We have implemented several features and fixed a lot of bugs since the last
release.

I think it's time for Apache Griffin to build the 0.6.0 release. What do you
think?


Thanks,
William


[jira] [Resolved] (GRIFFIN-329) Measure unit test cases failed on the condition of no docker image

2020-06-21 Thread Eugene Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/GRIFFIN-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Liu resolved GRIFFIN-329.

Resolution: Fixed

Merged PR 580 and closed this issue.
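
For anyone hitting the same failure locally, a rough pre-check before running `mvn test` could look like the Python sketch below. It only tests the Unix socket path shown in the log. Testcontainers tries other strategies as well (the log also shows a docker-machine attempt), so a False result is a hint rather than a verdict. The function name is hypothetical and the sketch is not taken from the PR.

```python
import os
import stat

# Socket path reported in the log as missing.
DEFAULT_SOCKET = "/var/run/docker.sock"


def docker_socket_present(path=DEFAULT_SOCKET):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, permission problem, etc.: treat as "not present".
        return False
```

If `docker_socket_present()` returns False on a Linux box, the Elasticsearch connector test above is likely to abort in the same way.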

> Measure unit test cases failed on the condition of no docker image 
> ---
>
> Key: GRIFFIN-329
> URL: https://issues.apache.org/jira/browse/GRIFFIN-329
> Project: Griffin
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Eugene Liu
> Priority: Minor
> Fix For: 0.6.0
>
>
> $ mvn test
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] - Could not 
> find a valid Docker environment. Please check configuration. Attempted 
> configurations were:
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     
> UnixSocketClientProviderStrategy: failed with exception 
> InvalidConfigurationException (ping failed). Root cause NoSuchFileException 
> (/var/run/docker.sock)
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     
> DockerMachineClientProviderStrategy: failed with exception 
> InvalidConfigurationException (Exception when executing docker-machine status 
> )
> 2020-06-12 10:39:04.047 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] - As no valid 
> configuration was found, execution cannot continue
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest
>  *** ABORTED ***
>   org.testcontainers.containers.ContainerFetchException: Can't get Docker 
> image: 
> RemoteDockerImage(imageName=docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.1,
>  imagePullPolicy=DefaultPullPolicy())
>   at 
> org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1279)
>   at 
> org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:613)
>   at 
> org.testcontainers.elasticsearch.ElasticsearchContainer.<init>(ElasticsearchContainer.java:49)
>   at 
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.beforeAll(ElasticSearchDataConnectorTest.scala:43)
>   at 
> org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
>   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>   at 
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.run(ElasticSearchDataConnectorTest.scala:11)
>   at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1210)
>   at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1257)
>   at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1255)
>   ...
>   Cause: java.lang.IllegalStateException: Could not find a valid Docker 
> environment. Please see logs and check configuration
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy.lambda$getFirstValidStrategy$3(DockerClientProviderStrategy.java:163)
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy$$Lambda$30/1047873000.get(Unknown
>  Source)
>   at java.util.Optional.orElseThrow(Optional.java:290)
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy.getFirstValidStrategy(DockerClientProviderStrategy.java:155)
>   at 
> org.testcontainers.DockerClientFactory.getOrInitializeStrategy(DockerClientFactory.java:126)
>   at 
> org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:147)
>   at 
> org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
>   at 
> org.testcontainers.LazyDockerClient.listImagesCmd(LazyDockerClient.java:12)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Queries regarding Apache Griffin

2020-06-15 Thread Eugene Liu
Adding Achyut Anand to the mail thread.

From: Eugene Liu 
Sent: Monday, June 15, 2020 10:51 PM
To: dev@griffin.apache.org 
Subject: Re: Queries regarding Apache Griffin

Yes, you can absolutely just bring the 'measure' module into your project, as 
Chitral mentions. But if you want to try Griffin out first, the UI can help you 
complete the configuration quickly.

Thx
Eugene Liu



From: Chitral Verma 
Sent: Monday, June 15, 2020 10:40 PM
To: dev@griffin.apache.org 
Subject: Re: Queries regarding Apache Griffin

Hi Achyut,

You can pull the `measure` module from master and use that in your existing
codebase.
Data quality expressions can be extended using `Expr2DQSteps`; you can
refer to `CompletenessExpr2DQSteps` as an example.

Let me know if you have any questions.

Regards,
Chitral Verma

On Mon, 15 Jun, 2020, 20:06 Achyut Anand,  wrote:

> Hello,
>
> This is Achyut from Cummins. My team is looking at a data profiling and
> rule based data quality tool. Apache Griffin seems like a tool that would
> help us achieve our goals. However, I wanted to know if we could integrate
> Griffin with our own applications or frameworks that we already have or do
> we always have to use Griffin's UI? Another question that I had was that
> can we define our rules for data quality and validation?
>
> Thank you and have a great day!
>
> Regards,
>
> Achyut Anand
> Data Engineer
> Cummins Inc.
> (617) 752-3389
>
>


Re: Queries regarding Apache Griffin

2020-06-15 Thread Eugene Liu
Yes, you can absolutely just bring the 'measure' module into your project, as 
Chitral mentions. But if you want to try Griffin out first, the UI can help you 
complete the configuration quickly.

Thx
Eugene Liu



From: Chitral Verma 
Sent: Monday, June 15, 2020 10:40 PM
To: dev@griffin.apache.org 
Subject: Re: Queries regarding Apache Griffin

Hi Achyut,

You can pull the `measure` module from master and use that in your existing
codebase.
Data quality expressions can be extended using `Expr2DQSteps`; you can
refer to `CompletenessExpr2DQSteps` as an example.

Let me know if you have any questions.

Regards,
Chitral Verma

On Mon, 15 Jun, 2020, 20:06 Achyut Anand,  wrote:

> Hello,
>
> This is Achyut from Cummins. My team is looking at a data profiling and
> rule based data quality tool. Apache Griffin seems like a tool that would
> help us achieve our goals. However, I wanted to know if we could integrate
> Griffin with our own applications or frameworks that we already have or do
> we always have to use Griffin's UI? Another question that I had was that
> can we define our rules for data quality and validation?
>
> Thank you and have a great day!
>
> Regards,
>
> Achyut Anand
> Data Engineer
> Cummins Inc.
> (617) 752-3389
>
>


[jira] [Commented] (GRIFFIN-329) Measure unit test cases failed on the condition of no docker image

2020-06-11 Thread Eugene Liu (Jira)


[ 
https://issues.apache.org/jira/browse/GRIFFIN-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17133888#comment-17133888
 ] 

Eugene Liu commented on GRIFFIN-329:


I made a change to fix the issue:

[https://github.com/apache/griffin/pull/574]

> Measure unit test cases failed on the condition of no docker image 
> ---
>
> Key: GRIFFIN-329
> URL: https://issues.apache.org/jira/browse/GRIFFIN-329
> Project: Griffin
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Eugene Liu
> Priority: Minor
> Fix For: 0.6.0
>
>
> $ mvn test
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] - Could not 
> find a valid Docker environment. Please check configuration. Attempted 
> configurations were:
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     
> UnixSocketClientProviderStrategy: failed with exception 
> InvalidConfigurationException (ping failed). Root cause NoSuchFileException 
> (/var/run/docker.sock)
> 2020-06-12 10:39:04.046 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     
> DockerMachineClientProviderStrategy: failed with exception 
> InvalidConfigurationException (Exception when executing docker-machine status 
> )
> 2020-06-12 10:39:04.047 ERROR 
> [org.testcontainers.dockerclient.DockerClientProviderStrategy] - As no valid 
> configuration was found, execution cannot continue
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest
>  *** ABORTED ***
>   org.testcontainers.containers.ContainerFetchException: Can't get Docker 
> image: 
> RemoteDockerImage(imageName=docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.1,
>  imagePullPolicy=DefaultPullPolicy())
>   at 
> org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1279)
>   at 
> org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:613)
>   at 
> org.testcontainers.elasticsearch.ElasticsearchContainer.<init>(ElasticsearchContainer.java:49)
>   at 
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.beforeAll(ElasticSearchDataConnectorTest.scala:43)
>   at 
> org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
>   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>   at 
> org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.run(ElasticSearchDataConnectorTest.scala:11)
>   at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1210)
>   at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1257)
>   at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1255)
>   ...
>   Cause: java.lang.IllegalStateException: Could not find a valid Docker 
> environment. Please see logs and check configuration
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy.lambda$getFirstValidStrategy$3(DockerClientProviderStrategy.java:163)
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy$$Lambda$30/1047873000.get(Unknown
>  Source)
>   at java.util.Optional.orElseThrow(Optional.java:290)
>   at 
> org.testcontainers.dockerclient.DockerClientProviderStrategy.getFirstValidStrategy(DockerClientProviderStrategy.java:155)
>   at 
> org.testcontainers.DockerClientFactory.getOrInitializeStrategy(DockerClientFactory.java:126)
>   at 
> org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:147)
>   at 
> org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
>   at 
> org.testcontainers.LazyDockerClient.listImagesCmd(LazyDockerClient.java:12)





[jira] [Created] (GRIFFIN-329) Measure unit test cases failed on the condition of no docker image

2020-06-11 Thread Eugene Liu (Jira)
Eugene Liu created GRIFFIN-329:
--

Summary: Measure unit test cases failed on the condition of no docker image
Key: GRIFFIN-329
URL: https://issues.apache.org/jira/browse/GRIFFIN-329
Project: Griffin
Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Eugene Liu
Fix For: 0.6.0


$ mvn test
2020-06-12 10:39:04.046 ERROR [org.testcontainers.dockerclient.DockerClientProviderStrategy] - Could not find a valid Docker environment. Please check configuration. Attempted configurations were:
2020-06-12 10:39:04.046 ERROR [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     UnixSocketClientProviderStrategy: failed with exception InvalidConfigurationException (ping failed). Root cause NoSuchFileException (/var/run/docker.sock)
2020-06-12 10:39:04.046 ERROR [org.testcontainers.dockerclient.DockerClientProviderStrategy] -     DockerMachineClientProviderStrategy: failed with exception InvalidConfigurationException (Exception when executing docker-machine status )
2020-06-12 10:39:04.047 ERROR [org.testcontainers.dockerclient.DockerClientProviderStrategy] - As no valid configuration was found, execution cannot continue
org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest *** ABORTED ***
  org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageName=docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.1, imagePullPolicy=DefaultPullPolicy())
  at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1279)
  at org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:613)
  at org.testcontainers.elasticsearch.ElasticsearchContainer.<init>(ElasticsearchContainer.java:49)
  at org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.beforeAll(ElasticSearchDataConnectorTest.scala:43)
  at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:212)
  at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
  at org.apache.griffin.measure.datasource.connector.batch.ElasticSearchDataConnectorTest.run(ElasticSearchDataConnectorTest.scala:11)
  at org.scalatest.Suite$class.callExecuteOnSuite$1(Suite.scala:1210)
  at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1257)
  at org.scalatest.Suite$$anonfun$runNestedSuites$1.apply(Suite.scala:1255)
  ...
  Cause: java.lang.IllegalStateException: Could not find a valid Docker environment. Please see logs and check configuration
  at org.testcontainers.dockerclient.DockerClientProviderStrategy.lambda$getFirstValidStrategy$3(DockerClientProviderStrategy.java:163)
  at org.testcontainers.dockerclient.DockerClientProviderStrategy$$Lambda$30/1047873000.get(Unknown Source)
  at java.util.Optional.orElseThrow(Optional.java:290)
  at org.testcontainers.dockerclient.DockerClientProviderStrategy.getFirstValidStrategy(DockerClientProviderStrategy.java:155)
  at org.testcontainers.DockerClientFactory.getOrInitializeStrategy(DockerClientFactory.java:126)
  at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:147)
  at org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
  at org.testcontainers.LazyDockerClient.listImagesCmd(LazyDockerClient.java:12)





Re: A question about compiling and packaging Griffin

2020-04-28 Thread Eugene Liu
Hi,

I cannot see the picture in the mail. Could you paste the Maven output here?

Thx


From: 张青 <13561103...@163.com>
Sent: Tuesday, April 28, 2020 4:15 PM
To: dev@griffin.apache.org 
Subject: A question about compiling and packaging Griffin


[cid:image003.png@01D61D78.2BFF7010]

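Hello, when I compile and package Griffin with Maven, I keep getting the error 
shown in the picture. At first I thought it was caused by my changes to the 
configuration files, so I downloaded the source code from git and imported it 
into IDEA without modifying any file. After the dependencies finished 
downloading, I ran mvn clean install in the Terminal and still hit the same 
problem, so I hope to get your help.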











Re: Support Jupyter in Griffin

2019-12-04 Thread Eugene Liu
I think this idea deserves discussion. From an ecosystem perspective, Griffin 
should provide more types of front end, which would give users more options.

Could we summarize the application scenarios for the different requirements?

Thx

Eugene


From: Grant 
Sent: Wednesday, December 4, 2019 9:32 AM
To: dev@griffin.apache.org 
Subject: Support Jupyter in Griffin

Hi Everyone,

I have a proposal to support Jupyter Notebook in Griffin.

The user could edit DQ configurations in the browser, then submit them to
a Griffin kernel in the back end. The Griffin kernel connects to the established
Griffin backend services (measure or service).

We could also introduce JupyterLab to enable a multi-user environment for Griffin.

Ideas, opinions, and suggestions are welcome.

Thanks

Grant Guo


Re: [ANNOUNCE] New Committer: Wan Kun

2019-09-01 Thread Eugene Liu
Welcome, Kun, and congratulations on formally joining Apache Griffin!

Let's keep Griffin moving forward.

Thx
Eugene

From: William Guo 
Sent: Sunday, September 1, 2019 7:04 PM
To: dev@griffin.apache.org ; wan...@apache.org 

Subject: [ANNOUNCE] New Committer: Wan Kun

Hi all,

The Project Management Committee (PMC) for Apache Griffin
has invited Wan Kun to become a committer and we are pleased
to announce that he has accepted.

Kun has already made several contributions to the Griffin community, submitting
patches for bug fixes and contributing new measure features. We are so
glad to have him as our new committer.

Please join me in welcoming Kun.


Thanks,
William
On behalf of Apache Griffin PMC


How to configure committer’s mail

2019-06-11 Thread Eugene Liu
FYI, in case you'd like to use your Apache mail account:

http://griffin.apache.org/docs/contribute.html

Thx
Eugene


Re: [DISCUSS]Alternatives way to access hive metadata?

2019-06-11 Thread Eugene Liu
I agree. Considering the security and permission limits in a wide variety of 
deployments, Griffin should provide a common way to get Hive metadata, such as 
using JDBC.

Thx
Eugene

From: William Guo 
Sent: Wednesday, June 12, 2019 7:28 AM
To: dev@griffin.apache.org
Subject: Re: [DISCUSS]Alternatives way to access hive metadata?

hi all,

As I commented in the ticket:

I have been told about this several times;
some companies have their own security policies for metadata tables.
It might be a general requirement for the community.

What do you think?

Thanks,
William

On Tue, Jun 11, 2019 at 11:32 AM Qian Wang  wrote:

> Hi,
>
> Griffin needs to read Hive metadata from Hive metastore, however, reading
> metadata from Hive metastore directly may have some security issue and may
> be restricted by company security team.
>
> We should provide another solution to get Hive metadata information such
> as using JDBC implementation.
>
> Here is the ticket:
> https://issues.apache.org/jira/browse/GRIFFIN-256
>
> Best,
> Eric
>


Re: Apache griffin installation document

2019-05-21 Thread Eugene Liu
Haritha,

I don't know what problem you ran into. Could you share it here?

I suppose you just need to download Livy, update its configuration, and then 
start it.
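
Once it is started, one way to confirm that Livy is answering is to call its REST API. The Python sketch below assumes Livy listens on its default port 8998 on localhost and queries the `GET /sessions` endpoint. Adjust the host and port to your setup. The function names are illustrative only.

```python
import http.client
import json
import urllib.request


def livy_sessions_url(host="localhost", port=8998):
    """Build the URL of Livy's session-listing endpoint."""
    return f"http://{host}:{port}/sessions"


def livy_is_up(url, timeout=3.0):
    """Return True if `url` answers with JSON that has a 'sessions' key."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False
    return isinstance(body, dict) and "sessions" in body
```

Calling `livy_is_up(livy_sessions_url())` right after starting Livy gives a quick yes/no before you wire it into Griffin.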

Thx
Eugene

From: Haritha Reddy 
Sent: Tuesday, May 21, 2019 6:00 PM
To: Eugene Liu
Cc: dev@griffin.apache.org
Subject: Re: Apache griffin installation document

Hi,

Thank you for the link. I have finished installing Java, Spark, Scala, MySQL, 
Hive, and Hadoop by following it, but I am facing issues while installing Livy.
Please help me with a detailed installation doc for Livy.

Regards,
Haritha

On Fri, May 17, 2019 at 8:31 AM Eugene Liu <liu...@apache.org> wrote:
Hi Haritha

We have no document that describes installation on Unix in general, but there 
is a step-by-step installation guide for Ubuntu. I hope it helps.

https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md


thx
Eugene

From: Haritha Reddy <harithareddy...@gmail.com>
Sent: Thursday, May 16, 2019 4:53 PM
To: dev@griffin.apache.org
Subject: Apache griffin installation document

Hi,
Can you please send me a detailed document on how to install and
configure Apache Griffin on a Unix OS?


Re: Apache griffin installation document

2019-05-16 Thread Eugene Liu
Hi Haritha

We have no document that describes installation on Unix in general, but there 
is a step-by-step installation guide for Ubuntu. I hope it helps.

https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md


thx
Eugene

From: Haritha Reddy 
Sent: Thursday, May 16, 2019 4:53 PM
To: dev@griffin.apache.org
Subject: Apache griffin installation document

Hi,
Can you please send me a detailed document on how to install and
configure Apache Griffin on a Unix OS?


Re: Griffin docs or videos help

2019-04-28 Thread Eugene Liu
Hi Amarnath

You can find simple use cases at http://griffin.apache.org/docs/quickstart.html

If you want to build the whole application, this guide will help you:
https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md

Thx
Eugene

From: AMARNATH 
Sent: Sunday, April 28, 2019 6:27 PM
To: dev@griffin.apache.org
Subject: Griffin docs or videos help

Hi Griffin users,
We want to do a POC with Griffin, but we have been unable to find materials 
on it.
Could you please point us to the right tutorials or docs to get started with 
Griffin?

Regards
Amarnath



[RESULT][VOTE] Release Apache Griffin 0.5.0

2019-04-09 Thread Eugene Liu
Thanks to everyone who has tested the release candidate and given their 
comments and votes.


The tally is as follows.
4 binding +1s:
* William Guo
* He Wang
* Kevin Yao
* Lionel Liu


No 0s or -1s.


Therefore I am delighted to announce that the proposal to release Apache 
Griffin 0.5.0 has passed.


We will be publishing the release soon.


Best regards,
Eugene Liu
On behalf of the Griffin Team
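
For reference, the pass condition applied to the tally in this mail can be written as a small check. This is a Python sketch of the usual ASF release-vote rule (at least three binding +1 votes, and more binding +1 than -1), with a made-up function name.

```python
def release_vote_passes(binding_plus, binding_minus):
    """Usual ASF release rule: >= 3 binding +1 and more +1 than -1."""
    return binding_plus >= 3 and binding_plus > binding_minus


# Tally reported in this mail: four binding +1 votes, no 0s or -1s.
binding_plus_voters = ["William Guo", "He Wang", "Kevin Yao", "Lionel Liu"]
passed = release_vote_passes(len(binding_plus_voters), 0)
print("passed" if passed else "not passed")
```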



[VOTE] Release of Apache Griffin-0.5.0

2019-04-05 Thread Eugene Liu
Hi all,

This is a call for a vote on releasing Apache Griffin 0.5.0, release 
candidate. This is a new milestone release of Griffin.
Apache Griffin is a data quality service for modern data systems. It defines a 
standard process to define and measure data quality for well-known dimensions. 
With Apache Griffin, users will be able to quickly define their data quality 
requirements and then get the results in near real time in a systematic way.

The source tarball, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/release/griffin/
The tag to be voted upon is 0.5.0:

https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/griffin-0.5.0
The release hash is :

https://gitbox.apache.org/repos/asf?p=griffin.git;a=commit;h=bad3fba0b7fd0bf3b286db834b053c4651e0656b
The Nexus Staging URL:

https://repository.apache.org/content/repositories/orgapachegriffin-1021
Release artifacts are signed with the following key:

DEB7B22DE1FDD7BE93368A231033FD5FC7A80CB5

KEYS file available:
 
https://dist.apache.org/repos/dist/release/griffin/KEYS

For information about the contents of this release, see:


 https://dist.apache.org/repos/dist/release/griffin/0.5.0/CHANGES.txt

Please vote on releasing this package as Apache Griffin 0.5.0

The vote will be open for 72 hours.

[ ] +1 Release this package as Apache Griffin 0.5.0
[ ] +0 no opinion
[ ] -1 Do not release this package because ...

Thanks,
Eugene
On behalf of Apache Griffin PPMC



Re: [VOTE] Release of Apache Griffin-0.5.0

2019-04-05 Thread Eugene Liu
William,

Thanks for pointing that out. I'll resend the release announcement later.

Thx
Eugene

From: William Guo 
Sent: Saturday, April 6, 2019 7:40 AM
To: dev@griffin.apache.org
Subject: Re: [VOTE] Release of Apache Griffin-0.5.0

hi Eugene,

There are some links to old versions in the above email. Could you update
them for the 0.5.0 release?

Thanks,
William

On Fri, Apr 5, 2019 at 10:41 PM Eugene Liu  wrote:

> Hi all,
>
> This is a call for a vote on releasing Apache Griffin 0.5.0, release
> candidate. This is a new milestone release of Griffin.
> Apache Griffin is a data quality service for modern data systems. It
> defines a standard process to define and measure data quality for
> well-known dimensions. With Apache Griffin, users will be able to quickly
> define their data quality requirements and then get the results in near
> real time in a systematic way.
>
> The source tarball, including signatures, digests, etc. can be found
> at:
> https://dist.apache.org/repos/dist/release/griffin/
> The tag to be voted upon is 0.5.0:
>
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/0.5
> <
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/0.1.5-incubating
> >.0
> The release hash is :
>
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=commit;h=bad3fba0b7fd0bf3b286db834b053c4651e0656b
> The Nexus Staging URL:
> https://repository.apache.org/content/repositories/orgapachegriffin-10
> <https://repository.apache.org/content/repositories/orgapachegriffin-1006
> >21
> Release artifacts are signed with the following key:
>
> DEB7B22DE1FDD7BE93368A231033FD5FC7A80CB5
>
> KEYS file available:
> <https://dist.apache.org/repos/dist/dev/incubator/griffin/KEYS>
> https://dist.apache.org/repos/dist/release/griffin/KEYS
>
> For information about the contents of this release, see:
> <
> https://dist.apache.org/repos/dist/dev/incubator/griffin/0.1.5-incubating/CHANGES.txt>
> https://dist.apache.org/repos/dist/release/griffin/0.5.0/CHANGES.txt
>
> Please vote on releasing this package as Apache Griffin 0.5.0
>
> The vote will be open for 72 hours.
>
> [ ] +1 Release this package as Apache Griffin 0.5.0
> [ ] +0 no opinion
> [ ] -1 Do not release this package because ...
>
> Thanks,
> Eugene
> On behalf of Apache Griffin PPMC
>
>


Call for vote for apache griffin 0.5.0

2019-04-05 Thread Eugene Liu
Hi all,

This is a call for a vote on releasing Apache Griffin 0.5.0, release
candidate. This is a new milestone release of Griffin.
Apache Griffin is a data quality service for modern data systems. It defines a
standard process to define and measure data quality for well-known dimensions.
With Apache Griffin, users will be able to quickly define their data quality
requirements and then get the results in near real time in a systematic way.

The source tarball, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/release/griffin/
The tag to be voted upon is 0.5.0:

https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/0.5.0
The release hash is :

https://gitbox.apache.org/repos/asf?p=griffin.git;a=commit;h=bad3fba0b7fd0bf3b286db834b053c4651e0656b
The Nexus Staging URL:

https://repository.apache.org/content/repositories/orgapachegriffin-1021
Release artifacts are signed with the following key:

DEB7B22DE1FDD7BE93368A231033FD5FC7A80CB5

KEYS file available:
 
https://dist.apache.org/repos/dist/release/griffin/KEYS

For information about the contents of this release, see:


 https://dist.apache.org/repos/dist/release/griffin/0.5.0/CHANGES.txt

Please vote on releasing this package as Apache Griffin 0.5.0

The vote will be open for 72 hours.

[ ] +1 Release this package as Apache Griffin 0.5.0
[ ] +0 no opinion
[ ] -1 Do not release this package because ...

Thanks,
Eugene
On behalf of Apache Griffin PPMC



Re: Data Quality Tool Trial

2019-04-01 Thread Eugene Liu
Hi Mauricio

Welcome to Apache Griffin Community!

You can get a first look at what Griffin can do from
https://github.com/apache/griffin. The community would be glad to share its
usage experience with you; if needed, we can set up a session.
Could you give more details about the data quality requirements in your application?

Thx

From: Mauricio Calleja Vargas 
Sent: Friday, March 29, 2019 2:24 AM
To: dev@griffin.apache.org
Subject: Data Quality Tool Trial


Good afternoon,



I just reviewed your webpage and I’m interested in learning about the data
quality functionality of your tool. We’re a Mexican company focused on water
treatment; we recently started a Data Governance Model across the company,
and we’re evaluating different options that fulfill our requirements.



Is there any chance that we can have a session to see the 
benefits/functionality of your tool and talk about our needs?



Best regards



The information and documents in this e-mail are confidential and may be 
legally privileged. It is intended solely for the addressee(s), and any access 
to this e-mail by anyone else is unauthorized. If you are not the intended 
recipient of this e-mail, any disclosure, copying, or distribution of it is 
prohibited and may be unlawful. If you have received this e-mail by mistake, 
please notify the sender and immediately and permanently delete it and destroy 
any printed copies.


Re: [DISCUSS] Build release for 0.5.0

2019-03-31 Thread Eugene Liu
Totally agree, it's time to move toward the next stage.

Thx
Eugene

From: William Guo 
Sent: Monday, April 1, 2019 8:27 AM
To: dev@griffin.apache.org
Subject: [DISCUSS] Build release for 0.5.0

hi all,

We have implemented several features and fixed a lot of bugs recently.

I think it's time for Apache Griffin to build the 0.5.0 release. What do you
think?


Thanks,
William


Do you know how to solve scala.MatchError

2019-03-05 Thread Eugene Liu
Hi,

Recently I hit a problem when a Griffin task tries to write its result into
Elasticsearch.

http://127.0.0.1:9200/griffin/accuracy (of class 
org.apache.hadoop.fs.FsUrlConnection)
19/03/05 15:54:20 ERROR sink.SinkTaskRunner$: task fails: task 155176800 
retry ends but fails
scala.MatchError: 
org.apache.hadoop.fs.FsUrlConnection:http://127.0.0.1:9200/griffin/accuracy (of 
class org.apache.hadoop.fs.FsUrlConnection)
at scalaj.http.HttpRequest.scalaj$http$HttpRequest$$doConnection(Http.scala:343)
at scalaj.http.HttpRequest.exec(Http.scala:335)
at scalaj.http.HttpRequest.asString(Http.scala:455)

But when I manually call the ES REST API, it works well:
/apache/griffin$ curl -i -H "Content-Type: application/json" -X POST 
http://127.0.0.1:9200/griffin/accuracy -d '{"griffin": true}'
HTTP/1.1 201 Created
Location: /griffin/accuracy/0oPnTGkBwDXqtjrq46Hg
content-type: application/json; charset=UTF-8
content-length: 178

{"_index":"griffin","_type":"accuracy","_id":"0oPnTGkBwDXqtjrq46Hg","_version":1,"result":"created","_shards":{"total":3,"successful":1,"failed":0},"_seq_no":0,"_primary_term":9}

I see there is a similar bug in the Spark issue tracker, but I'm not sure it is the same one.
https://issues.apache.org/jira/browse/SPARK-25694?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel


Do you have any idea how to address this?

ES.YML

network.host: 127.0.0.1
http.cors.enabled: true
http.cors.allow-origin: "*"
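
To rule out the Elasticsearch side, the same POST can also be issued from outside the Spark JVM. Below is a minimal Python sketch that mirrors the curl call above; the function names are illustrative and not part of Griffin, and the endpoint is the one from the log.

```python
import json
import urllib.request

# Endpoint taken from the log above; adjust host and port for your environment.
ES_URL = "http://127.0.0.1:9200/griffin/accuracy"


def build_es_request(url: str, document: dict) -> urllib.request.Request:
    """Build the same JSON POST that the curl command sends."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_document(url: str, document: dict, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code (needs a running ES node)."""
    request = build_es_request(url, document)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_es_request(ES_URL, {"griffin": True})
print(request.get_method(), request.full_url)
```

If `post_document(ES_URL, {"griffin": True})` succeeds while the Griffin job still fails, the problem is more likely in how the connection is opened inside the JVM than in the ES configuration.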


Re: can't load text-dir data source

2019-02-21 Thread Eugene Liu
Yuchen,

Could you check your Hadoop config, /apache/hadoop/etc/hadoop/hdfs-site.xml,
and confirm that the folders in properties like the following are valid?


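<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/hadoop-data/nn</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/hadoop-data/dn</value>
</property>

<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///data/hadoop-data/snn</value>
</property>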


From: Yuchen Zhang 
Sent: Friday, February 22, 2019 3:10 PM
To: dev@griffin.apache.org
Subject: can't load text-dir data source


Hi there,



I’m new to Apache Griffin and the Hadoop ecosystem, and I’m trying to set up a
Griffin test env.

For now, I’m trying to get an accuracy measure for a text data source on Hadoop,
but I get the error below:

2019-02-19 09:21:06 WARN  DataSource:36 - load data source [src] fails

2019-02-19 09:21:06 WARN  DataSource:36 - load data source [tgt] fails



But the data source can be loaded by a SparkSession:
spark.read.text("hdfs:///griffin/src/src.txt");



Could anyone help me figure out whether there are any problems in my configuration?



Here’s my text resource file:

[cid:image001.png@01D4CAC0.B7F42770]



Here’s the DQ config:

[cid:image002.png@01D4CAC0.B7F42770]



Thanks.




Re: Simplify Griffin-DSL implementation

2019-01-29 Thread Eugene Liu
In the early stage, I think Griffin should consider a smooth upgrade/migration
strategy, allowing all lower versions to move up to the latest release.

After some stable releases such as 1.x, 2.x, ..., maybe we need not consider
rolling upgrades.

From: Grant 
Sent: Wednesday, January 30, 2019 7:06 AM
To: dev@griffin.apache.org
Subject: Re: Simplify Griffin-DSL implementation

We could have a SQL syntax checker using the existing parser logic.

Once it detects a SQL expression with the DSL type "griffin-dsl", it
could take the following steps:
1. attempt to delegate the execution of the rule to the "spark-sql" type
directly. Whether the execution is successful or not, run step 2
2. notify the user to use "spark-sql" in the future

We keep the checker in the distribution for only a few releases
(say, 2 or 3), and then we remove it.

Another thing I am thinking is that we should consider supporting UDFs
provided by end users.
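
The two-step checker described above could look roughly like the following Python sketch. It is only an illustration: `looks_like_sql` is a hypothetical stand-in for the existing parser logic, and `route_rule` is not a Griffin API.

```python
import warnings

# Hypothetical heuristic: treat a rule as plain SQL when its first token is one
# of these keywords. A real checker would reuse Griffin's existing parser logic.
SQL_KEYWORDS = {"select", "with"}


def looks_like_sql(rule_text: str) -> bool:
    tokens = rule_text.split(None, 1)
    return bool(tokens) and tokens[0].lower() in SQL_KEYWORDS


def route_rule(rule: dict) -> dict:
    """Return the rule to execute, delegating SQL-like griffin-dsl rules to spark-sql."""
    if rule.get("dsl.type") == "griffin-dsl" and looks_like_sql(rule.get("rule", "")):
        # Step 2: notify the user, independent of how the execution turns out.
        warnings.warn(
            'SQL rules under "griffin-dsl" are deprecated; use "spark-sql" instead',
            DeprecationWarning,
            stacklevel=2,
        )
        # Step 1: hand the rule over with the "spark-sql" type.
        return {**rule, "dsl.type": "spark-sql"}
    return rule
```

Rules that are not plain SQL pass through unchanged, so removing the checker after a few releases would only affect rules that were already being warned about.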

On Tue, Jan 29, 2019 at 5:35 PM Nick Sokolov  wrote:

> I think we need to maintain backward compatibility or provide easy
> (automated?) migration -- otherwise existing users will be stuck in older
> versions.
>
> On Tue, Jan 29, 2019 at 2:28 PM William Guo  wrote:
>
> > Thanks Grant.
> >
> > I agree Griffin-DSL should leverage spark-sql for the SQL part, and
> > Griffin-DSL should work as a DQ layer to assemble different dimensions, as
> > MLlib does.
> > Since we already have some experience in the data quality domain, it is now
> > time for Griffin-DSL to evolve to the next level.
> >
> > Thanks,
> > William
> >
> >
> > On Wed, Jan 30, 2019 at 5:48 AM Grant  wrote:
> >
> > > Hi all,
> > >
> > > I would suggest simplifying Griffin-DSL.
> > >
> > > Currently, Griffin supports three types of DSL: spark-sql, griffin-dsl
> > > and df-ops respectively. In this proposal, I only focus on the first two.
> > >
> > > Griffin-DSL is a SQL-like language that supports a wide range of clauses,
> > > keywords, operators etc., as Spark SQL does. The class "GriffinDslParser"
> > > also defines how to parse the SQL-like syntax. Actually, Griffin-DSL's
> > > SQL-like syntax could be covered by Spark SQL completely. Spark 2.0
> > > substantially improved SQL functionalities with SQL2003 support and can
> > > now run all 99 TPC-DS queries.
> > >
> > > So is it possible for Griffin-DSL to remove all SQL-like language
> > > features? All rules which could be expressed by SQL would be categorized
> > > as "spark-sql" DSL type instead of "griffin-dsl". In this case, we could
> > > simplify the implementation of Griffin-DSL.
> > >
> > > In my understanding, Griffin-DSL should consist of high-order expressions,
> > > each of which represents a specific set of semantics. Griffin-DSL continues
> > > focusing on the expressions with the richer semantics in the data
> > > exploration or wrangling area, and leaves all SQL compatible expressions
> > > to Spark SQL. Griffin-DSL is still translated into Spark-SQL when being
> > > executed.
> > >
> > > here is an example from the unit test "_accuracy-batch-griffindsl.json"
> > >
> > > "evaluate.rule": {
> > >   "rules": [
> > >     {
> > >       "dsl.type": "griffin-dsl",
> > >       "dq.type": "accuracy",
> > >       "out.dataframe.name": "accu",
> > >       "rule": "source.user_id = target.user_id AND upper(source.first_name) = upper(target.first_name) AND source.last_name = target.last_name AND source.address = target.address AND source.email = target.email AND source.phone = target.phone AND source.post_code = target.post_code",
> > >       "details": {
> > >         "source": "source",
> > >         "target": "target",
> > >         "miss": "miss_count",
> > >         "total": "total_count",
> > >         "matched": "matched_count"
> > >       },
> > >       "out": [
> > >         {
> > >           "type": "record",
> > >           "name": "missRecords"
> > >         }
> > >       ]
> > >     }
> > >   ]
> > > }
> > >
> > > If we move SQL-like syntax out of Griffin-DSL, the preceding example
> > > will take "dsl.type" as "spark-sql", and "rule" would probably be a list
> > > of columns, or all columns by default.
> > >
> > >   Discussions are welcomed.
> > >
> > > Grant
> > >
> >
>
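
To make the idea of an accuracy rule given as a list of columns concrete, here is a small Python sketch of what such a measure computes. It assumes exact matching on the listed columns and uses the metric names from the "details" block of the example. It is not Griffin's implementation, and transformations such as upper() are left out.

```python
def accuracy(source, target, columns):
    """Count source records that have an exact match in target on `columns`."""
    target_keys = {tuple(row[c] for c in columns) for row in target}
    total = len(source)
    matched = sum(1 for row in source if tuple(row[c] for c in columns) in target_keys)
    return {
        "total_count": total,
        "matched_count": matched,
        "miss_count": total - matched,
    }


source = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": "b@example.com"},
    {"user_id": 3, "email": "c@example.com"},
]
target = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": "b@example.com"},
    {"user_id": 3, "email": "changed@example.com"},
]
result = accuracy(source, target, ["user_id", "email"])
print(result)
```

With a column list as the rule, the high-level expression carries the accuracy semantics, while the row matching itself can be generated as plain SQL.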


Re: Analyzing past batch data periodically

2019-01-17 Thread Eugene Liu
Hi Vikram

If you want to create a profiling job, there are user and API guides that can
help you:
https://github.com/apache/griffin/blob/master/griffin-doc/service/api-guide.md
https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md

thx
Eugene


From: Vikram Jain 
Sent: Friday, January 18, 2019 3:41 AM
To: dev@griffin.apache.org
Subject: Analyzing past batch data periodically

Hi,
I have 5 months data in my hive table which is partitioned day wise. I want to 
run a profiling job on this data and analyze the result and trend on weekly 
data i.e. I want to create the job that on each execution, processes 1 week of 
data incrementally and stores the metrics weekwise. I could not find a way to 
do this in Griffin. Can someone please help me with the solution if it exists.
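
For illustration, the weekly slicing itself can be sketched in Python as below. The sketch assumes a day partition column named `dt` that holds 'YYYY-MM-DD' strings (both are assumptions), and it only generates the filter predicates. How each predicate is fed into a profiling job is left open.

```python
from datetime import date, timedelta


def weekly_windows(start: date, end: date):
    """Yield (week_start, week_end) pairs covering [start, end); week_end is exclusive."""
    current = start
    while current < end:
        following = min(current + timedelta(days=7), end)
        yield current, following
        current = following


def partition_filter(week_start: date, week_end: date, column: str = "dt") -> str:
    """Build a predicate over the day partition column (column name is assumed)."""
    return (
        f"{column} >= '{week_start.isoformat()}' "
        f"AND {column} < '{week_end.isoformat()}'"
    )


windows = list(weekly_windows(date(2018, 8, 1), date(2018, 8, 22)))
for week_start, week_end in windows:
    print(partition_filter(week_start, week_end))
```

Each execution would then process exactly one window, so the stored metrics line up week by week.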

Regards,
Vikram


Re: [VOTE] Release of Apache Griffin-0.4.0

2018-12-28 Thread Eugene Liu
The verification items below have passed.

agree +1

- CHANGES.txt updated
- source-release.zip and pom files are listed
- no md5 or sha1 files
- LICENSE file is good
- NOTICE file is good
- signature file is good
- hash file is good
- licenses in file header check success
- source compile success
- third-party licenses are good
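
The "hash file is good" item can be reproduced with a few lines of Python. This sketch assumes a SHA-512 digest stored in `sha512sum` layout (hex digest first, then the file name). Other layouts would need different parsing, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_file_matches(artifact: Path, hash_file: Path) -> bool:
    """Compare the artifact digest with the first token of the hash file."""
    expected = hash_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

The signature check is a separate step and still needs the project's KEYS file and a GPG client.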

Thx
Eugene

From: Lionel Liu 
Sent: Friday, December 28, 2018 11:14 PM
To: dev@griffin.apache.org
Cc: dev
Subject: Re: [VOTE] Release of Apache Griffin-0.4.0

+1

I've checked:
- CHANGES.txt updated
- source-release.zip and pom files are listed
- no md5 or sha1 files
- LICENSE file is good
- NOTICE file is good
- signature file is good
- hash file is good
- licenses in file header check success
- source compile success
- third-party licenses are good

PS: Considering the release process will finish after 72 hours, it should
be released before the end of this year. If it is released in 2019, the
NOTICE file would not be aligned.

Thanks,
Lionel

On Fri, Dec 28, 2018 at 6:56 PM 181276056  wrote:

> Hi all,
>
> This is a call for a vote on releasing Apache Griffin 0.4.0, release
> candidate 0.
>
> Apache Griffin is a data quality service for modern data systems. It
> defines a standard process to define and measure data quality for
> well-known dimensions.
> With Apache Griffin, users will be able to quickly define their data
> quality requirements and then get the results in near real time in a
> systematic way.
>
> ** Highlights **
>  - Profiling measure UX improvements.
>  - Lifecycle hooks support
>  - Plaintext mode for measure creation
>
> The source tarball, including signatures, digests, etc. can be found
> at:
> https://dist.apache.org/repos/dist/dev/griffin/0.4.0/
>
> The tag to be voted upon is 0.4.0:
>
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=shortlog;h=refs/tags/griffin-0.4.0
>
> The release hash is :
>
> https://gitbox.apache.org/repos/asf?p=griffin.git;a=commit;h=56749b262ca8a5cf52b1fe42b2ba1a2eb7399d3d
>
> The Nexus Staging URL:
>
> https://repository.apache.org/content/repositories/orgapachegriffin-1020
> Release artifacts are signed with the following key:
> 6E6D ADE6 A72F F019 ED2A  E566 56DC C3C2 C3DA FFFC
>
> KEYS file available:
> https://dist.apache.org/repos/dist/dev/griffin/KEYS
> For information about the contents of this release, see:
> https://dist.apache.org/repos/dist/dev/griffin/0.4.0/CHANGES.txt
>
> Please vote on releasing this package as Apache Griffin 0.4.0
>
> The vote will be open for 72 hours.
> [ ] +1 Release this package as Apache Griffin 0.4.0
> [ ] +0 no opinion
> [ ] -1 Do not release this package because ...
>
> Thanks,
> Jason


Re: [NOTICE] We will move our griffin repo to gitbox

2018-12-26 Thread Eugene Liu
As you mentioned in previous mail, this change will not impact current 
community collaboration. We will continue to pull/push PR via github mirror, 
right?

Thx
Eugene

From: William Guo 
Sent: Wednesday, December 26, 2018 7:53 PM
To: dev@griffin.apache.org
Subject: Re: [NOTICE] We will move our griffin repo to gitbox

hi all,

Our code base has been moved to gitbox.apache.org.

Thanks,
William

On Tue, Dec 11, 2018 at 1:29 PM William Guo  wrote:

> hi all,
>
> As you might know, as required by ASF, we will move our repository from
> https://git-wip-us.apache.org/repos/asf/griffin.git
> to
> https://gitbox.apache.org/repos/asf/griffin.git
>
> We will do this in two to three days, and will keep you updated on this.
>
> that should be transparent from your end.
>
> Thanks,
> William
>


Re: [VOTE] Migrate to gitbox

2018-12-21 Thread Eugene Liu
agree +1

From: Grant 
Sent: Saturday, December 22, 2018 11:42 AM
To: dev@griffin.apache.org
Subject: Re: [VOTE] Migrate to gitbox

+1

On Fri, Dec 21, 2018 at 18:17 William Guo  wrote:

> Hi all,
>
> Given the recent announcement about gitbox.apache.org [1] (seamless
> integration with GitHub)
> I'm starting this vote to migrate Apache Griffin repository from git-wip to
> gitbox.
>
> More features:
> - Easier for code review compared to the Review Board
> - JIRA linking, which will automatically link a PR with its corresponding
> JIRA
> - Leverage Github ecosystem, such as web hooks for PR monitor, Code
> Static Analytics, Coverage Report.
>
> For Griffin committers, you need to link your GitHub account with your ASF
> account through https://gitbox.apache.org/
>
> [ ] +1, Migrate Griffin repository to gitbox
> [ ] -1, Keep the current git-wip no change
>
>
> Regards,
> William Guo
>
> [1] : https://gitbox.apache.org/
>


Re: Board Report of Griffin

2018-12-10 Thread Eugene Liu
Board Report of Griffin review +1

Thanks

From: Lionel Liu 
Sent: Monday, December 10, 2018 5:20 PM
To: dev@griffin.apache.org
Subject: Re: Board Report of Griffin

LGTM

Thanks,
Lionel

On Sun, Dec 9, 2018 at 12:25 PM William Guo  wrote:

> Hi All,
>
> Here is the draft of the board report this month (We need to submit it
> to the board before 12 Dec). Please let me know if anything is
> missing or you have any questions about it.
>
> ## Description:
>  - Apache Griffin is an open source Data Quality solution for Big Data,
> which supports both batch and streaming mode. It offers a unified process
> to measure your data quality from different perspectives, helping you build
> trusted data assets and thereby boosting confidence in your business.
>
> ## Issues:
>  - There are no issues requiring board attention at this time.
>
> ## Activity:
>  - Griffin PMC are working on moving code/site from incubator to griffin
> repository.
>  - Griffin will release 0.4.0 after the migration is done.
>
>
> ## Health report:
>  - Mail and commit activity is as good as usual.
>
> ## PMC changes:
>
>  - Currently 17 PMC members.
>  - Nick Sokolov is the last new PMC Member added since the last report
>  - Last PMC was added on Mon Sep 30, 2018
>
> ## Committer base changes:
>
>  - Currently 17 committers.
>  - New committers:
>  - Nick Sokolov was added as a committer on Sep 30, 2018
>
> ## Releases:
>
>  - 0.3.0 was released on Fri Sep 07, 2018
>
>
> ## Mailing list activity:
>
>  - dev@griffin.apache.org:
>  - 77 subscribers
>  - 284 emails sent by 29 people, divided into 143 topics in September.
>  - 393 emails sent by 44 people, divided into 66 topics in October.
>  - 205 emails sent by 48 people, divided into 43 topics in November.
>
>
> ## JIRA activity:
>
>  - 32 JIRA tickets created in the last 3 months
>  - 21 JIRA tickets closed/resolved in the last 3 months
>
>
>
> William Guo
>


improvement of build

2018-11-27 Thread Eugene Liu
folks,

There's a static code analysis service used by another incubator project, HAWQ;
see https://scan.coverity.com/projects/apache-incubator-hawq

What do you think about adding it to the Griffin build pipeline? It could
probably help improve code quality.

Eugene