Hi Karan,

Is your Hive cluster running on a Kerberized cluster? I suspect the problem is caused by that.
How about this: https://stackoverflow.com/questions/47533532/hivemetastoreclient-fails-to-connect-to-a-kerberized-cluster

Griffin uses HiveMetaStoreClient to connect to the Hive metastore service, so you can test that client directly to help resolve this problem.

Thanks,
Lionel

On Thu, Apr 12, 2018 at 2:11 PM, Karan Gupta <[email protected]> wrote:

> Hi Lionel,
>
> Thank you for the reply.
>
> I did try to increase hive.server2.thrift.max.worker.threads to 1500 from the default 500, but it did not resolve the issue. Also, we have 2 instances of Hive Server 2 running on different machines.
>
> Could you recommend any other workaround?
>
> Thank you,
> Karan Gupta
>
> *From:* Lionel Liu <[email protected]>
> *Sent:* Wednesday, April 11, 2018 10:34 AM
> *To:* [email protected]; Karan Gupta <[email protected]>
> *Subject:* Re: Griffin Service Error
>
> Hi Karan,
>
> I've read your log again and found that the error happens in the steps below:
>
> *1. You've configured hive.metastore.uris as "thrift://azudpoc2928.ent.lolcentral.com:9083", which is the correct one.*
>
> The Griffin server starts up and tries to connect to the Hive metastore service, and an error occurs:
>
> 2018-04-05 09:32:20.842 WARN 106074 --- [ main] hive.metastore : set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>
> *But immediately afterwards, it succeeds:*
>
> 2018-04-05 09:32:20.843 INFO 106074 --- [ main] hive.metastore : Connected to metastore.
>
> *2. Griffin service caches the Hive table metadata and refreshes it every 15 minutes.*
>
> The first refresh happens at startup:
>
> 2018-04-05 09:32:23.248 INFO 106074 --- [pool-4-thread-1] o.a.g.c.m.hive.HiveMetaStoreService : Evict hive cache
>
> *But it fails with this error:*
>
> 2018-04-05 09:32:23.260 ERROR 106074 --- [pool-4-thread-1] hive.log : Got exception: org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe (Write failed)
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (Write failed)
>
> *Griffin service logs this error during the cache refresh; at this point the cache has been evicted but fetching the new data fails:*
>
> 2018-04-05 09:32:23.263 ERROR 106074 --- [pool-4-thread-1] o.a.g.c.m.hive.HiveMetaStoreService : Can not get databases : Got exception: org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe (Write failed)
> 2018-04-05 09:32:23.263 INFO 106074 --- [pool-4-thread-1] o.a.g.c.m.hive.HiveMetaStoreService : After evict hive cache,automatically refresh hive tables cache.
>
> *3. Then Griffin service tries to reconnect to the Hive metastore asynchronously, but every time it tries to connect, the same error occurs:*
>
> 2018-04-05 09:32:23.269 INFO 106074 --- [pool-3-thread-1] hive.metastore : Trying to connect to metastore with URI thrift://azudpoc2928.ent.lolcentral.com:9083
> 2018-04-05 09:32:23.279 WARN 106074 --- [pool-3-thread-1] hive.metastore : set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>
> But the connection then succeeds as well:
>
> 2018-04-05 09:32:23.280 INFO 106074 --- [pool-3-thread-1] hive.metastore : Connected to metastore.
>
> *4. After 15 minutes, the same thing happens again.*
>
> I think there are two problems we need to investigate:
>
> *1. Every attempt to connect to the Hive metastore raises an error but then succeeds immediately.*
>
> I searched for this error message and found this: https://community.hortonworks.com/questions/146939/extration-warn-hivemetastore-set-ugi-not-successfu.html
>
> It looks like there may be too many client connections to the Hive metastore service.
>
> *2. Every time Griffin evicts the cache and tries to fetch new data over the connection built previously, it gets a broken pipe, as if the connection is too short-lived.*
>
> I wonder whether this could be solved by some configuration of the Hive metastore service, or whether it is also caused by too many connections.
>
> Could you check this? Hope it helps.
>
> Thanks,
> Lionel
>
> On Tue, Apr 10, 2018 at 8:57 PM, Karan Gupta <[email protected]> wrote:
>
> Hi Lionel,
>
> hive.metastore.uris is correctly set, to my knowledge, in application.properties. Could you suggest an alternative or a workaround?
>
> Thank you,
> Karan Gupta
>
> *From:* Lionel, Liu <[email protected]>
> *Sent:* Tuesday, April 10, 2018 6:00 PM
> *To:* Karan Gupta <[email protected]>; [email protected]
> *Subject:* RE: Griffin Service Error
>
> Hi Karan,
>
> It seems that connecting to the Hive metastore service fails; you need to set "hive.metastore.uris" to the correct value in application.properties.
>
> Thanks
> Lionel, Liu
>
> *From:* Karan Gupta <[email protected]>
> *Sent:* April 10, 2018 17:59
> *To:* Lionel, Liu <[email protected]>
> *Subject:* Griffin Service Error
>
> Hi Lionel,
>
> I am encountering the below error when I try to run the griffin service jar
>
> Any guidance would be very helpful.
>
> Thank you,
> Karan Gupta
>
> *From:* Vinod Raina
> *Sent:* Monday, April 9, 2018 2:18 PM
> *To:* Lionel, Liu <[email protected]>
> *Cc:* Karan Gupta <[email protected]>
> *Subject:* RE: Few Questions about Griffin
>
> Thank you Lionel, this information helps :)
>
> *Regards*
> *Vinod Raina* | [email protected]
> Associate Technical Architect
> M: +91 9711022965
>
> *From:* Lionel, Liu <[email protected]>
> *Sent:* Saturday, April 7, 2018 1:32 PM
> *To:* Vinod Raina <[email protected]>; [email protected]
> *Cc:* Karan Gupta <[email protected]>
> *Subject:* RE: Few Questions about Griffin
>
> Hi Vinod,
>
> For the first question, this looks like the validity dimension: measuring each data item against defined rules. The validity dimension has not been implemented in Griffin yet, but you can make it work with profiling for now. For example, you can define the profiling rule as "select count(*) from source where len(telephone) = 10 and name is not null", which produces the count of items matching the rule; with the total count as another metric, you can then compute the percentage. In fact, getting the count metrics is better than getting the percentage directly.
>
> For the second question, I'm not very familiar with Kerberos, but at eBay we also use an HDFS cluster with Kerberos authentication. The Griffin measure module works as a Spark application and supports all the Spark parameters, so it should work the same way as when you submit other Spark applications on your cluster. If that's not correct, please tell me, thanks.
>
> Thanks
> Lionel, Liu
>
> *From:* Vinod Raina <[email protected]>
> *Sent:* April 5, 2018 13:09
> *To:* Lionel Liu <[email protected]>; [email protected]
> *Cc:* Karan Gupta <[email protected]>
> *Subject:* RE: Few Questions about Griffin
>
> Thank you Lionel,
>
> I have 2 more follow-up queries:
>
> 1. My requirement is to check data quality in terms of whether the data conforms to the data types I expect. E.g., one column may hold a telephone number, so I expect a 10-digit number; another column is a birthdate, so I expect a date format; or there is a name column and I don't want it to be null/missing. So I need to create a metric report where I can see the percentage of data that conforms to the validations we have created. Can Griffin do that?
> 2. Also, our HDFS is a Kerberized cluster. Can Griffin work on a Kerberized cluster?
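
The count-based approach suggested in the reply to these queries (a profiling-style rule that yields the matched count, plus a total count as a second metric) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for Spark SQL (so LENGTH replaces len), the sample rows and the helper name validity_counts are hypothetical, and nothing here is Griffin's own API.

```python
import sqlite3

# Hypothetical sample rows standing in for the "source" table: (name, telephone).
SAMPLE_ROWS = [
    ("Asha", "9876543210"),   # passes the rule
    (None, "9123456780"),     # fails: name is missing
    ("Ravi", "12345"),        # fails: telephone is not 10 characters long
    ("Meera", "9000000001"),  # passes the rule
]


def validity_counts(rows):
    """Return (matched_count, total_count) for the rule quoted in the thread."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE source (name TEXT, telephone TEXT)")
        conn.executemany("INSERT INTO source VALUES (?, ?)", rows)
        # Same shape as the rule in the thread; SQLite spells the function LENGTH.
        matched = conn.execute(
            "SELECT COUNT(*) FROM source "
            "WHERE LENGTH(telephone) = 10 AND name IS NOT NULL"
        ).fetchone()[0]
        total = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]
        return matched, total
    finally:
        conn.close()


matched, total = validity_counts(SAMPLE_ROWS)
# Keep both counts as metrics and derive the percentage from them afterwards.
percentage = 100.0 * matched / total if total else 0.0
print(f"{matched} of {total} rows pass the rule ({percentage:.1f}%)")
```

Keeping the two counts as separate metrics, as the reply recommends, means the percentage can be recomputed or aggregated across batches later without losing information.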
>
> *Regards*
> *Vinod Raina* | [email protected]
> Associate Technical Architect
> M: +91 9711022965
>
> *From:* Lionel Liu <[email protected]>
> *Sent:* Tuesday, April 3, 2018 2:16 PM
> *To:* [email protected]; Vinod Raina <[email protected]>
> *Cc:* Karan Gupta <[email protected]>
> *Subject:* Re: Few Questions about Griffin
>
> Hi Vinod,
>
> We're glad to receive your email. Some other Griffin documents are listed below:
>
> wiki: https://cwiki.apache.org/confluence/display/GRIFFIN/Apache+Griffin
>
> github: https://github.com/apache/incubator-griffin/tree/master/griffin-doc
>
> And you can follow https://github.com/apache/incubator-griffin/blob/master/griffin-doc/docker/griffin-docker-guide.md to try the Griffin docker image.
>
> For your questions, I'll list my answers:
>
> *1. What is the usage of accuracy metric? In what situations, it will be useful?*
>
> Accuracy measures the match percentage between two data sources, which we call "target" and "source": "source" is the data source you trust, and "target" is the data source you want to check.
>
> For example, say "source" is [1, 2, 3, 4, 5] while "target" is [1, 3, 5, 7, 9]; we'll get the accuracy #(target items matched in source) / #(all target items) = 3/5 = 60%. Actually, "exactly match" is a narrow concept; in accuracy, we say "pass the match rule", and users can define their own "match rule" like "source.age <= target.age AND upper(source.city) = upper(target.city)" instead of "exactly match".
>
> When we have a data source we trust, let it be the "source"; then we can measure the accuracy of another data source, named "target", to figure out how far we can trust it.
>
> There's a standard use case:
>
> In our data pipeline, when we get users' data from the site, we persist it as table T1, which we trust as the source of truth. On the other hand, a copy of the users' data is pushed to some streaming or batch processes; after some steps, the processed data is persisted as table T2, and we want to know how correct it is, or how much we can trust it.
>
> Set T1 as "source" and T2 as "target", and we can get the accuracy of T2, with the wrong items from T2 persisted.
>
> And another specific use case:
>
> We have a streaming data processing system that consumes data from input and produces output. Each output data item also contains the key of its input item, and we want to know how much data is successfully processed.
>
> Set output as "source" and input as "target", and we can get the accuracy of input, with the missing items from input persisted.
>
> Actually, this case measures the completeness of output, but it works like reversed accuracy, so we can use it like this.
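
The accuracy definition above, #(target items matched in source) / #(all target items), can be sketched in a few lines of plain Python. This is a hedged illustration, not Griffin's implementation: the function name accuracy, the helper age_city_rule, and the sample records are all hypothetical, and the nested loop is only suitable for small examples. For the lists in the text it gives 3 matched items out of 5, i.e. 60%.

```python
def accuracy(source, target, match_rule=lambda s, t: s == t):
    """Return (ratio, unmatched_target_items).

    A target item counts as matched if at least one source item passes
    match_rule(source_item, target_item). The default rule is exact equality.
    """
    source_items = list(source)
    target_items = list(target)
    if not target_items:
        # Assumption: accuracy is treated as undefined for an empty target.
        raise ValueError("target is empty, accuracy is undefined")
    unmatched = [
        t for t in target_items
        if not any(match_rule(s, t) for s in source_items)
    ]
    ratio = (len(target_items) - len(unmatched)) / len(target_items)
    return ratio, unmatched


# The example from the text: exact match between two lists.
ratio, unmatched = accuracy([1, 2, 3, 4, 5], [1, 3, 5, 7, 9])
print(f"accuracy = {ratio:.0%}, unmatched target items = {unmatched}")


# A custom match rule mirroring the one quoted in the text:
# source.age <= target.age AND upper(source.city) = upper(target.city)
def age_city_rule(s, t):
    return s["age"] <= t["age"] and s["city"].upper() == t["city"].upper()


people_source = [{"age": 30, "city": "Noida"}]
people_target = [{"age": 31, "city": "NOIDA"}, {"age": 25, "city": "Noida"}]
rule_ratio, rule_unmatched = accuracy(people_source, people_target, age_city_rule)
print(f"rule-based accuracy = {rule_ratio:.0%}")
```

Returning the unmatched target items alongside the ratio mirrors the point made above that the wrong or missing items are persisted together with the metric.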
>
> However, in the Griffin measure configuration, the concepts of source and target are based on the code implementation, which differs from the business concept above. In the measure configuration documents, we're measuring the accuracy of "source".
>
> We are planning to modify the code implementation to align with the business concept later; when we do, we'll highlight it in the release notes.
>
> *2. Can we run other metrics using command-line? (or) Is only accuracy metric supported at the moment?*
>
> Yes, you can run the Griffin measure module from the command line directly, like this: https://github.com/bhlx3lyx7/griffin-docker/blob/master/svc_msr_new/prep/measure/start-accu.sh
>
> Currently, the Griffin UI module doesn't support all the dimensions, but the measure module supports accuracy, profiling, timeliness and uniqueness. You can find descriptions of them here: https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#griffin-dsl-translation-to-sql
>
> *3. Project roadmap for features?*
>
> The project roadmap was out of date, so we've updated it: https://cwiki.apache.org/confluence/display/GRIFFIN/0.+Roadmap
>
> Some new features we're planning for the short term:
>
> - streaming measure job scheduling.
> - support for more data quality dimensions, such as completeness, consistency, validity.
>
> And for the long term, possibly including:
>
> - support for more data sources, such as RDBs, elasticsearch.
> - anomaly detection support.
> - spark 2 support.
>
> *4. Can we use create custom Rules and profile existing data?*
>
> Yes, you can create custom rules for your data, according to the documents: https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-configuration-guide.md and https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-batch-sample.md
>
> The profiling rule supports simple spark-sql syntax directly, as described in https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#profiling
>
> If you want to use spark-sql, you can also define the rules like this: https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/dsl-guide.md#spark-sql
>
> *5. Postgresql and mysql -- both listed in Prerequisites. We have MySQL, Is that enough?*
>
> In fact, you can choose either PostgreSQL or MySQL.
>
> We used MySQL for measure and schedule persistence before, but due to a license issue with releases, we have had to switch to PostgreSQL recently.
>
> If you want to use MySQL, you need to modify some dependencies in the service module and the application.properties file, and rebuild the service.jar as well.
>
> We are going to provide a document to help users set up MySQL or other databases.
>
> Hope this helps; please feel free to ask if you have any questions.
>
> Thanks,
> Lionel
>
> On Tue, Apr 3, 2018 at 1:41 PM, Vinod Raina <[email protected]> wrote:
>
> Hi Griffin team,
> In our team, we are looking to create a Data Quality model for our EDL Ingestion and are exploring Apache Griffin for it. We have gone through the documentation. The documentation is still not complete, but we understand that the project is in incubation and there might be other reasons as well.
> It would be really helpful if there is any other source of information (other than the Apache portal and the GitHub readme) which can help us understand the usage of this framework.
> Also, we have the few questions below and would really appreciate it if you could help us with the answers:
>
> 1. What is the usage of accuracy metric? In what situations, it will be useful?
> 2. Can we run other metrics using command-line? (or) Is only accuracy metric supported at the moment?
> 3. Project roadmap for features?
> 4. Can we use create custom Rules and profile existing data?
> 5. Postgresql and mysql -- both listed in Prerequisites. We have MySQL, Is that enough?
>
> Regards
> Vinod Raina | [email protected]
> Associate Technical Architect
> M: +91 9711022965
> Tavant Technologies | www.tavant.com
> Okaya Centre, Tower 1, 5th Floor,B-5, Sector 62, Noida, UP 201 309
>
> ________________________________
> Any comments or statements made in this email are not necessarily those of Tavant Technologies. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. If you have received this in error, please contact the sender and delete the material from any computer. All emails sent from or to Tavant Technologies may be subject to our monitoring procedures.
