How are you folks getting over the learning curves associated with things like 
NiFi and Airflow?

> On May 28, 2016, at 9:50 AM, Suneel Marthi <[email protected]> wrote:
> 
> Debo,
> 
> On Tue, May 17, 2016 at 9:18 PM, Andrew Palumbo <[email protected]> wrote:
> 
>> We are certainly interested in online clustering algorithms, and
>> clustering of time series seems like a great fit.  (Our text vectorization
>> pipeline has not yet been reworked for the new Mahout "Samsara", but that is
>> of interest too.)  What type of compute platform would you require for this?
>> 
> 
> For the data processing pipeline, the requirements are:
>    (a) it should be agnostic to any distributed processing engine like
> Spark, Flink, etc.
>    (b) should be able to scale data pipelines and be able to support back
> pressure.
>    (c) should be able to ingest both batch and streaming data from Spark,
> Flink, Beam, etc.
> 
>   So far Apache NiFi seems to fit the bill for all of the above criteria
> (they don't have a Beam interface yet, but one is being worked on) and they also
> have an excellent GUI along with features to define common workflow
> templates that could be imported into custom workflows.
> 
> The other alternatives being considered are Airbnb's Airflow (proposed for
> the Apache Incubator; it defines workflows as DAGs in Python) and
> Apache Beam.
> 
> 
> 
>> 
>> Currently we are not looking at FPGAs.
>> 
> 
> If any of the math packages handle FPGAs natively out of the box, let's go
> for it. But we need not optimize heavily just to squeeze the last bit of
> performance out of FPGAs.
> 
> 
>> 
>> The most recent, and only real, documentation for Mahout Samsara is in
>> Apache Mahout: Beyond MapReduce:
>> 
>> 
>> http://www.weatheringthroughtechdays.com/2016/02/mahout-samsara-book-is-out.html.
>> You may want to check that out as a reference.
>> 
>> (I'm sorry for the shameless plug, but it is the only thing that covers almost
>> all Mahout "Samsara" features and architecture up to our previous release.)
>> 
> 
> I don't see this as a shameless plug; it's definitely much better than the
> dozen low-grade books that have been churned out by Packt and
> went nowhere, other than bringing disrepute to the project and community.
> 
> 
>> 
>> Please do let us know if you have any questions about the Samsara platform.
>> ________________________________________
>> From: Debojyoti Dutta <[email protected]>
>> Sent: Tuesday, May 17, 2016 8:35:04 PM
>> To: [email protected]
>> Subject: Re: [NEW member] Hi
>> 
>> Thanks Andy! Would like to see if there is interest for algorithms such as
>> 1) clustering text in an online fashion (maybe using LSH or sim/min hash)
>> or 2) online clustering of time series. Basically my focus is "online" or
>> real time.
>> 
>> LSH on GPU sounds very interesting and would love to look at the patches.
>> Personally have helped accelerate LSH on TCAMs long ago e.g.
>> http://arxiv.org/abs/1006.3514 .... Is GPU the only hw accel you are
>> looking at or are you considering PCIe FPGA cards too?
>> 
>> debo
>> 
>> On Tue, May 17, 2016 at 5:27 PM, Andrew Palumbo <[email protected]>
>> wrote:
>> 
>>> Welcome, Debojyoti.
>>> We look forward to your contributions.  We are currently working towards
>>> integrating GPU acceleration for our 0.13 release and LSH sounds like a
>>> great addition. Could you tell us some more about what you would like to
>>> do?
>>> 
>>> Let us know if we can help you get familiar with the Mahout code base. We
>>> try to implement algorithms in the math-scala module.
>>> 
>>> Thanks,
>>> 
>>> Andy
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -------- Original message --------
>>> From: Debojyoti Dutta <[email protected]>
>>> Date: 05/17/2016 8:11 PM (GMT-05:00)
>>> To: [email protected]
>>> Subject: [NEW member] Hi
>>> 
>>> Hi there,
>>> 
>>> Am very interested in contributing to Mahout, especially towards fast ML
>>> kernels that can be used for streaming. Have some experience with LSH-based
>>> techniques (including hw accel) for clustering and near-neighbor-based
>>> stuff in general.
>>> 
>>> Was chatting with Suneel and he suggested I join the merry band.
>>> 
>>> regards
>>> -Debo~
>>> 
>> 
>> 
>> 
>> --
>> -Debo~
>> 
