Re: Griffin Service Error

Lionel Liu Tue, 10 Apr 2018 22:04:12 -0700

Hi Karan,

I've read your log again and found that the error occurs in the following steps:


*1. You've configured hive.metastore.uris as
"thrift://azudpoc2928.ent.lolcentral.com:9083", which is correct.*
The Griffin server starts up and tries to connect to the hive metastore
service, and an error occurs:
2018-04-05 09:32:20.842  WARN 106074 --- [           main] hive.metastore
                         : set_ugi() not successful, Likely cause: new
client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

*But it succeeds immediately afterwards:*
2018-04-05 09:32:20.843  INFO 106074 --- [           main] hive.metastore
                         : Connected to metastore.

*2. The Griffin service caches the hive table metadata and refreshes it
every 15 minutes.*
The first refresh happens at startup:
2018-04-05 09:32:23.248  INFO 106074 --- [pool-4-thread-1]
o.a.g.c.m.hive.HiveMetaStoreService      : Evict hive cache

*But it fails with this error:*
2018-04-05 09:32:23.260 ERROR 106074 --- [pool-4-thread-1] hive.log
                         : Got exception:
org.apache.thrift.transport.TTransportException java.net.SocketException:
Broken pipe (Write failed)
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Broken pipe (Write failed)

*The Griffin service logs this error during the cache refresh; at this point
the cache has been evicted, but fetching the new data fails:*
2018-04-05 09:32:23.263 ERROR 106074 --- [pool-4-thread-1]
o.a.g.c.m.hive.HiveMetaStoreService      : Can not get databases : Got
exception: org.apache.thrift.transport.TTransportException
java.net.SocketException: Broken pipe (Write failed)
2018-04-05 09:32:23.263  INFO 106074 --- [pool-4-thread-1]
o.a.g.c.m.hive.HiveMetaStoreService      : After evict hive
cache,automatically refresh hive tables cache.

*3. The Griffin service then tries to reconnect to the hive metastore
asynchronously, but every time it tries to connect, the same error occurs:*
2018-04-05 09:32:23.269  INFO 106074 --- [pool-3-thread-1] hive.metastore
                         : Trying to connect to metastore with URI thrift://
azudpoc2928.ent.lolcentral.com:9083
2018-04-05 09:32:23.279  WARN 106074 --- [pool-3-thread-1] hive.metastore
                         : set_ugi() not successful, Likely cause: new
client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Connection reset

But the connection also succeeds:
2018-04-05 09:32:23.280  INFO 106074 --- [pool-3-thread-1] hive.metastore
                         : Connected to metastore.

*4. After 15 minutes, the same thing happens again.*


I think there are two problems we need to investigate:

*1. Every attempt to connect to the hive metastore raises an error, but then
succeeds immediately.*
I've googled this error message and found this:
https://community.hortonworks.com/questions/146939/extration-warn-hivemetastore-set-ugi-not-successfu.html
It seems to be caused by too many client connections to the hive metastore service.

*2. Every time Griffin evicts the cache and tries to fetch new data over the
connection established previously, it gets a broken pipe; it seems the
connection does not stay open long enough.*
I wonder whether it could be solved by some configuration of the hive metastore
service, or whether it is also caused by too many connections.
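
To make the second problem concrete, here is a minimal Python sketch of a refresh that fetches first and only replaces the cached data when the fetch succeeds, so a broken pipe does not leave the cache empty. This is not how HiveMetaStoreService is implemented; the class name MetadataCache and the fetch callable are hypothetical and only illustrate the idea.

```python
import threading


class MetadataCache:
    """Keeps the last good snapshot and swaps it only after a successful fetch."""

    def __init__(self, fetch):
        # `fetch` is a hypothetical callable that returns fresh metadata
        # (for example, a dict of database name -> table names).
        self._fetch = fetch
        self._lock = threading.Lock()
        self._snapshot = None

    def refresh(self):
        """Try to fetch new data; keep the old snapshot if the fetch fails."""
        try:
            fresh = self._fetch()
        except OSError as exc:
            # BrokenPipeError and ConnectionResetError are subclasses of OSError.
            print(f"refresh failed, keeping previous snapshot: {exc}")
            return False
        with self._lock:
            self._snapshot = fresh
        return True

    def get(self):
        """Return the most recent successfully fetched snapshot (or None)."""
        with self._lock:
            return self._snapshot
```

With this ordering, a failed refresh every 15 minutes would still leave the previously fetched tables available, although it would not fix the underlying connection problem.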

Could you check this? Hope it helps.


Thanks,
Lionel





On Tue, Apr 10, 2018 at 8:57 PM, Karan Gupta <karan.gu...@tavant.com> wrote:

> Hi Lionel,
>
>
>
> As far as I know, hive.metastore.uris is set correctly in
> application.properties. Could you suggest an alternative or a workaround?
>
>
>
>
>
> Thank you,
>
> Karan Gupta
>
>
>
> *From:* Lionel, Liu <bhlx3l...@163.com>
> *Sent:* Tuesday, April 10, 2018 6:00 PM
> *To:* Karan Gupta <karan.gu...@tavant.com>; dev@griffin.incubator.apache.org
> *Subject:* RE: Griffin Service Error
>
>
>
> Hi Karan,
>
>
>
> It seems that connecting to the hive metastore service fails; you need to set
> "hive.metastore.uris" to the correct value in application.properties.
>
>
>
> Thanks
> Lionel, Liu
>
>
>
> *From: *Karan Gupta <karan.gu...@tavant.com>
> *Sent: *April 10, 2018 17:59
> *To: *Lionel, Liu <bhlx3l...@163.com>
> *Subject: *Griffin Service Error
>
>
>
> Hi Lionel,
>
>
>
> I am encountering the error below when I try to run the griffin service jar:
>
>
>
> [image: cid:image001.png@01D3D0E0.ABE994E0]
>
>
>
>
>
> Any guidance would be very helpful.
>
>
>
> Thank you,
>
> Karan Gupta
>
> *From:* Vinod Raina
> *Sent:* Monday, April 9, 2018 2:18 PM
> *To:* Lionel, Liu <bhlx3l...@163.com>
> *Cc:* Karan Gupta <karan.gu...@tavant.com>
> *Subject:* RE: Few Questions about Griffin
>
>
>
> Thank you Lionel, this information helps :)
>
>
>
>
>
>
>
> *Regards*
>
> *Vinod Raina* | vinod.ra...@tavant.com
>
> Associate Technical Architect
>
> M: +91 9711022965
>
>
>
> *From:* Lionel, Liu <bhlx3l...@163.com>
> *Sent:* Saturday, April 7, 2018 1:32 PM
> *To:* Vinod Raina <vinod.ra...@tavant.com>; dev@griffin.incubator.apache.org
> *Cc:* Karan Gupta <karan.gu...@tavant.com>
> *Subject:* RE: Few Questions about Griffin
>
>
>
> Hi Vinod,
>
>
>
> For the first question, this looks like the validity dimension, which
> measures data items against defined rules. The validity dimension has not
> been implemented in griffin yet, but for now you can achieve it with
> profiling. For example, you can define the profiling rule as “select
> count(*) from source where len(telephone) = 10 and name is not null”,
> which produces the count of items matching the rule; with the total count
> as another metric, you can then compute the percentage. In fact, getting
> the count metrics is better than getting the percentage directly.
>
> For the second question, I’m not very familiar with Kerberos, but at
> eBay we’re also using an hdfs cluster with Kerberos authentication. The
> griffin measure module works as a spark application and supports all the
> spark parameters, so it should work the same way as submitting other spark
> applications on your cluster. If this is not correct, please tell me, thanks.
>
>
>
> Thanks
> Lionel, Liu
>
>
>
> *From: *Vinod Raina <vinod.ra...@tavant.com>
> *Sent: *April 5, 2018 13:09
> *To: *Lionel Liu <lionel...@apache.org>; dev@griffin.incubator.apache.org
> *Cc: *Karan Gupta <karan.gu...@tavant.com>
> *Subject: *RE: Few Questions about Griffin
>
>
>
> Thank you Lionel,
>
> I have 2 more follow-up queries:
>
>    1. My requirement is to check the data quality in terms of whether the
>    data conforms to the data types that I expect. E.g. one column may
>    have a telephone number, so I expect it to be a 10-digit number; another
>    column is a birthdate, so I expect it to be in a date format; or there is
>    a name column and I don’t want it to be null/missing. So I need to create
>    a metric report where I can see the percentage of data that conforms
>    to the validations that we have created. Can griffin do that?
>    2. Also, our HDFS is a kerberised cluster. Can griffin work on a
>    kerberised cluster?
>
>
>
>
>
>
>
> *Regards*
>
> *Vinod Raina* | vinod.ra...@tavant.com
>
> Associate Technical Architect
>
> M: +91 9711022965
>
>
>
> *From:* Lionel Liu <lionel...@apache.org>
> *Sent:* Tuesday, April 3, 2018 2:16 PM
> *To:* dev@griffin.incubator.apache.org; Vinod Raina <
> vinod.ra...@tavant.com>
> *Cc:* Karan Gupta <karan.gu...@tavant.com>
> *Subject:* Re: Few Questions about Griffin
>
>
>
> Hi Vinod,
>
>
>
> We're glad to receive your email. Some other Griffin documents are
> listed below:
>
> wiki: https://cwiki.apache.org/confluence/display/GRIFFIN/Apache+Griffin
>
> github: https://github.com/apache/incubator-griffin/tree/master/griffin-doc
>
> And you can follow
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/docker/griffin-docker-guide.md
> to try the griffin docker image.
>
>
>
> For your questions, I'll list my answers:
>
>
>
> *1. What is the usage of accuracy metric? In what situations, it will be
> useful?*
>
>
>
> Accuracy measures the match percentage between two data sources, which we
> call "target" and "source": "source" is the data source you trust, and
> "target" is the data source you want to check.
>
> For example, say "source" is [1, 2, 3, 4, 5], while "target" is [1, 3, 5,
> 7, 9]; we'll get the accuracy #(target items matched in source) / #(all
> target items) = 3/5 = 60%. Actually, "exactly match" is a narrow concept;
> in accuracy, we say "pass the match rule", and users can define their own
> "match rule" like "source.age <= target.age AND upper(source.city) =
> upper(target.city)" instead of "exactly match".
>
> When we have a data source we trust, let it be the "source"; then we can
> measure the accuracy of another data source named "target", to figure out
> how much we can trust it.
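
The ratio described above can be sketched in a few lines of Python. This is only an illustration of the definition, not griffin's measure implementation; the function name accuracy and the match parameter are hypothetical.

```python
def accuracy(source, target, match=lambda s, t: s == t):
    """Fraction of target items that pass the match rule against some source item.

    `match` defaults to exact equality; pass a custom rule to loosen it.
    """
    if not target:
        raise ValueError("target must not be empty")
    matched = [t for t in target if any(match(s, t) for s in source)]
    return len(matched) / len(target)


source = [1, 2, 3, 4, 5]
target = [1, 3, 5, 7, 9]
print(accuracy(source, target))  # 3 of 5 target items match -> 0.6
```

A custom rule such as "source.age <= target.age AND upper(source.city) = upper(target.city)" would be passed as the match argument, for example `lambda s, t: s["age"] <= t["age"] and s["city"].upper() == t["city"].upper()` for records stored as dicts.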
>
>
>
> There's a standard use case:
>
> In our data pipeline, when we get users' data from the site, we persist it
> as table T1, which we trust as the source of truth. On the other hand, a
> copy of the users' data is pushed to some streaming or batch processes;
> after some steps, the processed data is persisted as table T2, and we want
> to know how correct it is, or how much we can trust it.
>
> Setting T1 as "source" and T2 as "target", we can get the accuracy of T2,
> with the wrong items from T2 persisted.
>
>
>
> And another specific use case:
>
> We have a streaming data processing system that consumes data from input
> and produces output. Each output data item also contains the key of the
> input item, and we want to know how much data is successfully processed.
>
> Setting output as "source" and input as "target", we can get the accuracy
> of input, and the missing items from input will be persisted.
>
> Actually, this case measures the completeness of output, but it works like
> reversed accuracy, so we can use it like this.
>
>
>
> However, in the griffin measure configuration, the concepts of source and
> target are based on the code implementation, which is different from the
> business concept above. In the documents of the measure configuration, we're
> measuring the accuracy of "source".
>
> We are planning to modify the code implementation to align with the
> business concept later; when we do, we'll highlight it in the release notes.
>
>
>
>
>
> *2. Can we run other metrics using command-line? (or) Is only accuracy
> metric supported at the moment?*
>
>
>
> Yes, you can run the griffin measure module from the command line directly,
> like this:
> https://github.com/bhlx3lyx7/griffin-docker/blob/master/svc_msr_new/prep/measure/start-accu.sh
> .
>
> Currently, the griffin UI module doesn't support all the dimensions, but the
> measure module supports accuracy, profiling, timeliness and uniqueness; you
> can find descriptions of them here:
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#griffin-dsl-translation-to-sql
> .
>
>
>
>
>
> *3. Project roadmap for features?*
>
>
>
> The project roadmap was out of date; we've updated it:
> https://cwiki.apache.org/confluence/display/GRIFFIN/0.+Roadmap
>
> Some new features we're planning in the short term:
>
> - streaming measure job schedule.
>
> - more data quality dimensions support, such as completeness, consistency,
> validity.
>
> And for the long term, possibly including:
>
> - more data sources support, such as RDBs, elasticsearch.
>
> - anomaly detection support.
>
> - spark 2 support.
>
>
>
>
>
> *4. Can we create custom rules and profile existing data?*
>
>
>
> Yes, you can create custom rules for your data, according to the
> documents:
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-configuration-guide.md
> and
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-batch-sample.md
> .
>
> The profiling rule supports simple spark-sql syntax directly, as described in
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#profiling
> .
>
> If you want to use spark-sql, you can also define the rules like this:
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#spark-sql
> .
>
>
>
>
>
> *5. Postgresql and mysql -- both listed in Prerequisites. We have MySQL,
> Is that enough?*
>
>
>
> In fact, you can choose either postgresql or mysql.
>
> We used mysql for measure and schedule persistence before, but due to a
> licensing issue with releases, we have had to switch to postgresql.
>
> If you want to use mysql, you need to modify some dependencies in the service
> module and the application.properties file, and rebuild the service.jar as well.
>
> We are going to add a document to help users set up mysql or other databases.
>
>
>
>
>
> Hope this helps; please feel free to ask if you have any questions.
>
>
>
> Thanks,
>
> Lionel
>
>
>
> On Tue, Apr 3, 2018 at 1:41 PM, Vinod Raina <vinod.ra...@tavant.com>
> wrote:
>
> Hi Griffin team,
> In our team, We are looking to create a Data Quality model for your EDL
> Ingestion and are exploring Apache Griffin for it. We have gone through the
> documentation. The documentation is still not complete but we understand
> that the project is in incubation and there might be other reasons as well.
> It would be really helpful if there is any other source of information
> (other than the apache portal  and the git hub readme ) which can help us
> to understand the usage of this framework.
> Also, we have the few questions below and would really appreciate it if you
> could help us with the answers:
>
> 1. What is the usage of accuracy metric? In what situations, it will be
> useful?
> 2. Can we run other metrics using command-line? (or) Is only accuracy
> metric supported at the moment?
> 3. Project roadmap for features?
> 4. Can we create custom rules and profile existing data?
> 5. Postgresql and mysql -- both listed in Prerequisites. We have MySQL, Is
> that enough?
>
>
>
>
> Regards
> Vinod Raina | vinod.ra...@tavant.com
> Associate Technical Architect
> M: +91 9711022965
> Tavant Technologies | www.tavant.com
> Okaya Centre, Tower 1, 5th Floor,B-5, Sector 62, Noida, UP 201 309
>
>
>
>
>
>
>
>
