Re: Running MAAS in batch

2018-11-16 Thread deepak kumar
Simon,
Can you elaborate more on this:
> wrapped up in a batch engine like Spark to take advantage of more efficient
> "mass" scoring.

How can a MaaS model wrapped in Spark take advantage of mass scoring?

Thanks
Deepak

On Fri, Nov 16, 2018 at 9:15 PM Otto Fowler  wrote:

> That may be the best MAAS explanation I’ve seen Simon.
>
>
> On November 16, 2018 at 10:28:57, Simon Elliston Ball (
> si...@simonellistonball.com) wrote:
>
> MaaS is designed to wrap model inference (scoring) an event at a time, via
> a REST API. As such, running it in batch doesn't make a lot of sense, since
> each message would be processed individually. Most of the models you're
> likely to run in MaaS, however, are also likely to be easily batchable, and
> are probably better wrapped up in a batch engine like Spark to take
> advantage of more efficient "mass" scoring.
>
> Simon
>
> On Fri, 16 Nov 2018 at 15:18, deepak kumar  wrote:
>
>> Hi All
>> Right now MAAS supports running the model against real-time events being
>> streamed into the Metron platform.
>> Is there any way to run the models deployed in MAAS on the batch events /
>> data that have been indexed into HDFS?
>> If anyone has tried this batch model, please share some insights.
>> Thanks
>> Deepak.
>>
>>
>
> --
> --
> simon elliston ball
> @sireb
>
>
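
As a rough illustration of the "mass" scoring point above: a batch engine hands each worker a whole partition of events, so the model can be called once per chunk rather than once per event over REST. The sketch below is plain Python, not MaaS or Metron code. `ToyModel`, `score_batch`, and the `domain` field are hypothetical stand-ins. With Spark, a function like `score_partition` could be passed to `RDD.mapPartitions`, assuming the model can be loaded on the executors.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield lists of up to `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def score_partition(rows: Iterable[Dict],
                    score_batch: Callable[[List[Dict]], List[float]],
                    batch_size: int = 1000) -> Iterator[Dict]:
    """Score one partition of events, calling the model once per chunk."""
    for chunk in batched(rows, batch_size):
        scores = score_batch(chunk)  # one model call for the whole chunk
        for row, score in zip(chunk, scores):
            yield {**row, "score": score}


class ToyModel:
    """Hypothetical model: scores an event by the length of its domain."""

    def __init__(self) -> None:
        self.calls = 0

    def score_batch(self, events: List[Dict]) -> List[float]:
        self.calls += 1
        return [float(len(e.get("domain", ""))) for e in events]


# With Spark this would look roughly like (assumption, not run here):
#   scored = rdd.mapPartitions(lambda rows: score_partition(rows, model.score_batch))
```

The design point is only that per-call overhead (HTTP round trip, serialization, model setup) is paid once per chunk instead of once per event. Models that vectorize internally gain further from seeing many rows at once.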


Re: Running MAAS in batch

2018-11-16 Thread deepak kumar
Thanks Simon and it makes perfect sense.

On Fri, 16 Nov 2018 at 8:58 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> MaaS is designed to wrap model inference (scoring) an event at a time, via
> a REST API. As such, running it in batch doesn't make a lot of sense, since
> each message would be processed individually. Most of the models you're
> likely to run in MaaS, however, are also likely to be easily batchable, and
> are probably better wrapped up in a batch engine like Spark to take
> advantage of more efficient "mass" scoring.
>
> Simon
>
> On Fri, 16 Nov 2018 at 15:18, deepak kumar  wrote:
>
> > Hi All
> > Right now MAAS supports running the model against real-time events being
> > streamed into the Metron platform.
> > Is there any way to run the models deployed in MAAS on the batch events /
> > data that have been indexed into HDFS?
> > If anyone has tried this batch model, please share some insights.
> > Thanks
> > Deepak.
> >
> >
>
> --
> --
> simon elliston ball
> @sireb
>


Running MAAS in batch

2018-11-16 Thread deepak kumar
Hi All
Right now MAAS supports running the model against real-time events being
streamed into the Metron platform.
Is there any way to run the models deployed in MAAS on the batch events /
data that have been indexed into HDFS?
If anyone has tried this batch model, please share some insights.
Thanks
Deepak.


Re: HCP in Cloud infrastructures such as AWS , GCP, AZURE

2018-10-22 Thread deepak kumar
Thanks Carolyn.
Is there any defined reference architecture to refer to?

Thanks
Deepak

On Mon, Oct 22, 2018 at 8:23 PM Carolyn Duby  wrote:

>
> Hive 3.0 works well with block stores.  You can either add it to your
> Metron cluster or spin up an ephemeral cluster with Cloudbreak:
>
> 1. Metron streams into HDFS in JSON.
> 2. Compact daily with Spark into ORC format and store in block store (S3,
> ADLS, etc).
> 3. Query ORC in block store using external Hive 3.0 tables in HDP 3 using
> LLAP.
> 4. If querying externally from block store is too slow, try adding more
> LLAP cache or load data into HDFS prior to analysis.
>
> If you are using the Metron Alerts UI, you will need Solr, which works well
> only on fast disk. To keep costs down, reduce the context stored in Solr
> using the following techniques:
> 1. Only index the fields you might search on.
> 2. Reduce the formats you store in Solr to only those you will want to see
> in the Alerts UI.
> 3. Reduce the length of time you store data in Solr.
>
> Thanks
> Carolyn Duby
> Solutions Engineer, Northeast
> cd...@hortonworks.com
> +1.508.965.0584
>
> On 10/19/18, 7:18 AM, "deepak kumar"  wrote:
>
> >Hi All
> >I have a quick question about HCP deployments on cloud infrastructure such
> >as AWS.
> >I am planning to run a persistent cluster for all event streaming and
> >processing, and then run a transient cluster such as AWS EMR for batch
> >loads on the data ingested by the persistent cluster.
> >Has anyone tried this model?
> >Since the data volume is going to be huge, the cloud provider will charge
> >a lot of money for data I/O and storage.
> >Keeping this in mind, what would be the best cloud deployment of HCP
> >components, assuming an ingest rate of 10 TB per day?
> >
> >Thanks in advance.
> >
> >
> >Regards,
> >Deepak
>
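
Steps 1 and 2 in Carolyn's reply (JSON in HDFS, compacted daily into ORC) could be sketched roughly as below. This is a minimal sketch under stated assumptions, not a reference architecture. The source root, the `s3a://example-bucket` destination, the `dt=` directory naming, and the epoch-millisecond `timestamp` field are all assumptions for illustration. The Spark calls used are `spark.read.json`, `from_unixtime`, and `DataFrameWriter.orc`.

```python
from datetime import date
from typing import Tuple


def compaction_paths(sensor: str, day: date,
                     src_root: str = "hdfs:///apps/metron/indexing/indexed",
                     dst_root: str = "s3a://example-bucket/metron/orc") -> Tuple[str, str]:
    """Build (source, destination) paths for one sensor and one day.

    Both roots are hypothetical defaults; adjust to the actual cluster layout.
    """
    src = f"{src_root}/{sensor}"
    dst = f"{dst_root}/{sensor}/dt={day.isoformat()}"
    return src, dst


def compact_day(sensor: str, day: date) -> None:
    """Read one sensor's JSON events and write that day's events as ORC."""
    # Imported here so the path helper works without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("metron-orc-compaction").getOrCreate()
    src, dst = compaction_paths(sensor, day)

    events = spark.read.json(src)
    # Assumes each event has an epoch-millisecond `timestamp` field.
    # The day boundary follows the Spark session time zone.
    event_day = F.from_unixtime((F.col("timestamp") / 1000).cast("long"), "yyyy-MM-dd")
    events.where(event_day == day.isoformat()).write.mode("overwrite").orc(dst)
```

An external Hive table pointed at the destination root could then treat each `dt=` directory as a partition, which is one way to approach step 3.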


HCP in Cloud infrastructures such as AWS , GCP, AZURE

2018-10-19 Thread deepak kumar
Hi All
I have a quick question about HCP deployments on cloud infrastructure such
as AWS.
I am planning to run a persistent cluster for all event streaming and
processing, and then run a transient cluster such as AWS EMR for batch
loads on the data ingested by the persistent cluster.
Has anyone tried this model?
Since the data volume is going to be huge, the cloud provider will charge
a lot of money for data I/O and storage.
Keeping this in mind, what would be the best cloud deployment of HCP
components, assuming an ingest rate of 10 TB per day?

Thanks in advance.


Regards,
Deepak


Re: Change field separator in Metron to make it Hive and ORC friendly

2018-08-14 Thread deepak kumar
I agree Ali.
Maybe it can be a configuration parameter.

On Tue, Aug 14, 2018 at 3:24 PM Ali Nazemian  wrote:

> Hi Simon,
>
> We have temporarily decided to just change it to "_" for HDFS to avoid
> all the headaches of the bugs and issues that can be raised by using
> unsupported separators for ORC/Hive and Spark. However, I am not quite
> confident in "_" as an option for the community, as it becomes similar to
> the normal Metron separator. Maybe it would be nice to have the ability to
> change the separator to any other character and let users decide what they
> want to use.
>
> Cheers,
> Ali
>
> On Tue, Aug 14, 2018 at 12:14 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > Do you have any suggestions for what would make sense as a delimiter?
> >
> > On 9 August 2018 at 05:57, Ali Nazemian  wrote:
> >
> > > Hi All,
> > >
> > > > I was wondering if we can change the field separators in Metron to
> > > > be able to make it Hive/ORC friendly. I could find the following PR,
> > > > but neither dot nor colon is very Hive and ORC friendly and they will
> > > > cause some issues. Hence, I wanted to see if it is possible to change
> > > > the field separator to something else or even give users an ability
> > > > to define what separator to be used to make the data model consistent
> > > > across Elasticsearch and HDFS.
> > >
> > > https://github.com/apache/metron/pull/1022
> > >
> > > Cheers,
> > > Ali
> > >
> >
> >
> >
> > --
> > --
> > simon elliston ball
> > @sireb
> >
>
>
> --
> A.Nazemian
>
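
The configurable separator discussed in this thread could look roughly like the sketch below. It renames fields before the data is written for Hive/ORC. This is not Metron's implementation. `rename_fields` and the sample field names are hypothetical. The collision check reflects Ali's concern that "_" already appears inside ordinary field names.

```python
import re
from typing import Any, Dict

# Characters the thread describes as unfriendly to Hive/ORC: dot and colon.
_UNSAFE = re.compile(r"[.:]")


def rename_fields(message: Dict[str, Any], separator: str = "_") -> Dict[str, Any]:
    """Return a copy of `message` with '.' and ':' in field names replaced.

    Raises ValueError if two different fields map to the same new name.
    """
    renamed: Dict[str, Any] = {}
    for name, value in message.items():
        safe = _UNSAFE.sub(separator, name)
        if safe in renamed:
            raise ValueError(f"field name collision on {safe!r}")
        renamed[safe] = value
    return renamed


# Illustrative usage with a made-up event:
event = {"enrichments:geo:ip_dst_addr:country": "US", "ip_src_addr": "10.0.0.1"}
hive_ready = rename_fields(event, separator="__")
```

A multi-character separator such as "__" makes the renaming easier to reverse than a single "_", at the cost of longer column names. Which trade-off is right would be up to each deployment.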