Re: Fluo + Kerberos

2018-03-13 Thread Alan Camillo
We were able to run Fluo with Kerberos without major problems. Suggestions: add properties to fluo-app to specify the principal, the keytab, and the other necessary Kerberos attributes, and alter the code to authenticate properly using these parameters. If you agree, we are going to open an issue
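The suggestion above could look roughly like the sketch below. It is only an illustration: the property names `fluo.kerberos.principal` and `fluo.kerberos.keytab` are hypothetical and are not existing Fluo configuration keys, and the `KerberosSettings` class is invented for this example. The sketch only reads and validates the two values; the actual login step is left as a comment.

```java
import java.util.Optional;
import java.util.Properties;

final class KerberosSettings {
    // Hypothetical keys, for illustration only; not existing Fluo properties.
    static final String PRINCIPAL_KEY = "fluo.kerberos.principal";
    static final String KEYTAB_KEY = "fluo.kerberos.keytab";

    final String principal;
    final String keytab;

    KerberosSettings(String principal, String keytab) {
        this.principal = principal;
        this.keytab = keytab;
    }

    // Returns settings only when both values are present and non-blank,
    // so a missing keytab is treated the same as Kerberos being disabled.
    static Optional<KerberosSettings> fromProperties(Properties props) {
        String principal = props.getProperty(PRINCIPAL_KEY, "").trim();
        String keytab = props.getProperty(KEYTAB_KEY, "").trim();
        if (principal.isEmpty() || keytab.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new KerberosSettings(principal, keytab));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PRINCIPAL_KEY, "fluo@EXAMPLE.COM");
        props.setProperty(KEYTAB_KEY, "/etc/security/keytabs/fluo.keytab");

        Optional<KerberosSettings> settings = fromProperties(props);
        // A real integration would pass these values to a Kerberos login call,
        // for example Hadoop's UserGroupInformation.loginUserFromKeytab.
        System.out.println(settings.map(s -> s.principal).orElse("Kerberos disabled"));
    }
}
```

Returning an empty `Optional` when either value is missing keeps non-Kerberos deployments working without extra configuration; whether that is the right default is a design choice for the proposed issue.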

Spark + Fluo

2018-03-13 Thread Alan Camillo
Hey fellas! Sorry to demand so much from you, but we are really trying to put Fluo to work here and we are facing some issues. Recently we decided to use Apache Spark to start the process of ingesting 300 million lines with 62 columns each. We studied this: https://fluo.apache.org/blog/2016/12/22

Re: Spark + Fluo

2018-03-13 Thread Mike Walch
Hi Alan! No worries about emailing the list. Your email is actually helpful. It's made it clear that we need to improve our troubleshooting docs. There is some documentation at the link below but we could use more: https://fluo.apache.org/docs/fluo/1.2/administration/manage-applications What ver

Re: Spark + Fluo

2018-03-13 Thread Keith Turner
On Tue, Mar 13, 2018 at 7:11 AM, Alan Camillo wrote:
> Hey fellas!
> Sorry to demand so much from you. But we are really trying to put Fluo to work here and we are facing some issues.
>
> Recently we decided to use Apache Spark to star the process to ingest 300 millions of lines with 62 colu

Re: Fluo + Kerberos

2018-03-13 Thread Mike Walch
Sounds good. All pull requests are welcome.
On Tue, Mar 13, 2018 at 6:52 AM, Alan Camillo wrote:
> We were able to run Fluo with Kerberos without big problems.
>
> Suggestions: add some properties on fluo-app to informe principal, keytab and the other necessary attributes related to Kerberos.

Re: Spark + Fluo

2018-03-13 Thread Mike Walch
I opened a PR to add some troubleshooting docs to the website: https://github.com/apache/fluo-website/pull/142
On Tue, Mar 13, 2018 at 10:59 AM, Keith Turner wrote:
> On Tue, Mar 13, 2018 at 7:11 AM, Alan Camillo wrote:
> > Hey fellas!
> > Sorry to demand so much from you. But we are really

Re: Spark + Fluo

2018-03-13 Thread Alan Camillo
Great, Mike! Thank you both for the suggestions. I'll try to implement the ideas. A little more about the scenario:
- We are using Fluo version 1.2
- Spark is at version 1.6 (unfortunately), with JDK 1.8
- Accumulo is at version 1.7
When we try with fewer messages, everything goes well.

Re: Spark + Fluo

2018-03-13 Thread Alan Camillo
Another important piece of information:
- We tested with *no observers*, and consequently *no workers*
- We just need to know whether the Loader/Oracle will handle this quantity of transactions and how long it will take.
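For a rough sense of scale, the 300 million lines with 62 columns each mentioned earlier in the thread can be turned into a back-of-the-envelope estimate. The sketch below assumes one cell is written per column, and the throughput of 100,000 cells per second is a made-up placeholder, not a measured or documented Fluo figure; `LoadEstimate` is a name invented for this example.

```java
final class LoadEstimate {
    // Total cells written, assuming one cell per column of each input row.
    static long totalCells(long rows, int columnsPerRow) {
        return Math.multiplyExact(rows, (long) columnsPerRow);
    }

    // Hours needed at a given sustained write rate (cells per second).
    static double estimatedHours(long cells, double cellsPerSecond) {
        if (cellsPerSecond <= 0) {
            throw new IllegalArgumentException("cellsPerSecond must be positive");
        }
        return cells / cellsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long rows = 300_000_000L;   // from the thread
        int columns = 62;           // from the thread
        // Placeholder rate: an assumption, replace with a measured value.
        double assumedCellsPerSecond = 100_000.0;

        long cells = totalCells(rows, columns);
        // 300,000,000 x 62 = 18,600,000,000 cells
        System.out.println("cells: " + cells);
        System.out.println("hours at assumed rate: "
                + estimatedHours(cells, assumedCellsPerSecond));
    }
}
```

Measuring the real rate on a small sample (where everything reportedly goes well) and plugging it in would give a more useful number than the placeholder.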

Re: Spark + Fluo

2018-03-13 Thread Mike Walch
If you are running without workers, the problem is probably in your Spark/Loader process as the Oracle process is pretty simple and lightweight. If your Loader process isn't stuck (from checking it with jstack), this could be due to collisions. Fluo metrics will report the number of collisions. On

Re: Spark + Fluo

2018-03-13 Thread Alan Camillo
Our Loader class is pretty straightforward too. Here it is:
class CadastralLoader(registro:org.apache.spark.sql.Row) extends org.apache.fluo.api.client.Loader {
  override def load(tx:org.apache.fluo.api.client.TransactionBase, context:org.apache.fluo.api.client.Loader.Context):Unit = {
    val ro