Hi Karan,

For HTTP persistence, are the metrics persisted directly from “Spark”? (or) Griffin services writes into it?
[Answer] The metrics are persisted directly from the Spark application.

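To illustrate what "persisted directly from the Spark application" means in practice: the job itself issues an HTTP POST to the configured endpoint, so the URL has to be reachable from the cluster nodes. Below is a minimal Python sketch that builds such a request without sending it. It is not Griffin's own code; the host name es-host and the metric fields are hypothetical placeholders.

```python
import json
import urllib.request


def build_metric_request(api_url, metric):
    """Build (but do not send) a JSON POST request for one metric document."""
    body = json.dumps(metric).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host name and metric fields, for illustration only.
request = build_metric_request(
    "http://es-host:9200/griffin/accuracy",
    {"name": "example_measure", "value": {"total": 100, "matched": 98, "miss": 2}},
)
print(request.get_method(), request.full_url)

# To actually send it from a cluster node:
#   urllib.request.urlopen(request, timeout=10)
```

Sending the request from one of the cluster nodes is a quick way to check that the endpoint is reachable from there.
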
Our URL is like this: http://localhost:9200/griffin/accuracy (if it is from Griffin service, it will work…. But from Spark executors, it won't work as localhost resolves to executor host)
[Answer] I think you can change "localhost" to the IP address of ES.

But we have not created any index in “ES” called “griffin” or “accuracy”….? What should we be doing here?
[Answer] You don't need to create the indices in ES; ES will create them when the metrics are posted to it.

The "email" and "sms" parameters are not enabled in this version, so you can just ignore them in env.json.

BTW, have the metrics been persisted on HDFS?

Thanks,
Lionel

On Fri, May 4, 2018 at 2:24 PM, Karan Gupta <[email protected]> wrote:

> Hi,
>
> Thank you for the detail.
>
> In env.json, we have specified both HDFS and HTTP.
>
> For HTTP persistence, are the metrics persisted directly from “Spark”? (or) Griffin services writes into it?
>
> Our URL is like this: http://localhost:9200/griffin/accuracy (if it is from Griffin service, it will work…. But from Spark executors, it won't work as localhost resolves to executor host)
>
> But we have not created any index in “ES” called “griffin” or “accuracy”….? What should we be doing here?
>
> One more:
>
> Yesterday we found that “email” and “sms” parts of the env.json are not configured properly.
> They appear as “array” in JSON… but the “EmailParam” and “SmsParam” do not expect a List…
> This was causing Spark jobs not to launch.
> We edited the env.json accordingly…. We hope we did the right thing…
> Can you confirm this?
>
> Thank you,
> Karan Gupta
>
> *From:* Lionel Liu <[email protected]>
> *Sent:* Friday, May 4, 2018 11:46 AM
> *To:* Karan Gupta <[email protected]>
> *Cc:* [email protected]
> *Subject:* Re: No Index Formation in Elastic Search
>
> Hi Karan,
>
> First, we need to check whether Griffin has finished successfully. What persist types did you configure in env.json? "log", "hdfs", "http"?
>
> - "log": print the metrics in the application log.
> - "hdfs": the metrics will be persisted in the HDFS path you've set.
> - "http": post the metrics to the "api" you've set, which should be the Elasticsearch endpoint by default.
>
> You can choose multiple of them.
>
> If "http" is not configured correctly, posting metrics to ES fails.
> If "hdfs" is configured but you cannot find any metrics persisted in the "path", maybe Griffin has not finished the calculation correctly.
> If "log" is configured, you can get the application log from yarn:
>
> yarn logs -applicationId <appId> > applog
>
> Then read the applog and check whether any output metric was calculated.
>
> If no metric is persisted by any of your configured persist types, you need to read the applog and find the error message. Then you can show it to me, and I'll help you find the cause.
>
> Thanks,
> Lionel
>
> On Fri, May 4, 2018 at 2:00 PM, Karan Gupta <[email protected]> wrote:
>
> > Hi Lionel,
> >
> > While the Spark Application gets finished, I do not see any Index getting created in the elastic search, hence I do not see the data quality metrics getting populated.
> > Could you help me out with a possible solution?
> >
> > Thank you,
> > Karan Gupta
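
A small sketch of the check this thread suggests for the "http" persist type: scan env.json for "http" entries whose "api" points at a loopback host, since such an address resolves to the executor host rather than to ES when the Spark job posts the metrics. The layout used here (a "persist" list whose entries have "type" and "config") is an assumption about env.json and should be compared against your actual file; only the "path" and "api" settings are named in this thread, and the HDFS path below is a made-up placeholder.

```python
import json
from urllib.parse import urlparse

# Host names that resolve to the local machine on every node.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def loopback_http_apis(env):
    """Return the "api" URLs of "http" persist entries that use a loopback host."""
    flagged = []
    for entry in env.get("persist", []):
        if entry.get("type") != "http":
            continue
        api = entry.get("config", {}).get("api", "")
        if urlparse(api).hostname in LOOPBACK_HOSTS:
            flagged.append(api)
    return flagged


# Assumed env.json fragment; the structure and the HDFS path are illustrative only.
ENV_JSON = """
{
  "persist": [
    {"type": "log", "config": {}},
    {"type": "hdfs", "config": {"path": "hdfs:///tmp/griffin/persist"}},
    {"type": "http", "config": {"api": "http://localhost:9200/griffin/accuracy"}}
  ]
}
"""

env = json.loads(ENV_JSON)
print(loopback_http_apis(env))
# prints ['http://localhost:9200/griffin/accuracy']
```

Any URL the function reports would need its host replaced with an address of the ES node that the Spark executors can reach.
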
