Re: Metron MaaS deployment issues

2019-12-10 Thread Hema malini
Hi,

I am facing the same issue even after increasing the memory. Can you please
help? In the log I could only see a session timeout.

Thanks and Regards,
Hema

On Tue, 10 Dec, 2019, 12:34 AM Hema malini,  wrote:

> Hi,
>
> I followed the instructions in the below link -
>
> https://community.cloudera.com/t5/Community-Articles/Metron-Model-as-a-Service-Maas-full-dev-platform/ta-p/247394
> and also the Metron Git repository. I was able to deploy the sample for a
> datasource. Now I am trying to deploy a pretrained PyTorch model and predict
> on the incoming data stream. When I execute the rest.sh script directly, I
> get the proper response, so the REST endpoint itself is working fine. But
> when I try to deploy the same artifacts in MaaS (rest.sh, the Python file,
> and the model.pkl file), I get an error. I checked the application container
> log in YARN and could see only the error below. What could be the issue?
>
> Logs
> ---
>
> 19/12/09 20:03:39 INFO callback.LaunchContainer: deeplog_class.py localized: 
> scheme: "hdfs" host: "localhost.localdomain" port: 8020 file: 
> "/user/metron/model/deeplog_class.py"
> 19/12/09 20:03:39 INFO callback.LaunchContainer: rest.sh localized: scheme: 
> "hdfs" host: "localhost.localdomain" port: 8020 file: 
> "/user/metron/model/rest.sh"
> 19/12/09 20:03:39 INFO callback.LaunchContainer: mapper localized: scheme: 
> "hdfs" host: "localhost.localdomain" port: 8020 file: 
> "/user/metron/model/mapper"
> 19/12/09 20:03:39 INFO callback.LaunchContainer: model localized: scheme: 
> "hdfs" host: "localhost.localdomain" port: 8020 file: 
> "/user/metron/model/model"
> 19/12/09 20:03:39 INFO callback.LaunchContainer: deep_log.py localized: 
> scheme: "hdfs" host: "localhost.localdomain" port: 8020 file: 
> "/user/metron/model/deep_log.py"
> 19/12/09 20:03:39 INFO callback.LaunchContainer: Executing container command: 
> {{JAVA_HOME}}/bin/java org.apache.metron.maas.service.runner.Runner -ci 
> 15393162788866 -zq localhost.localdomain:2181 -zr /metron/maas/config -s 
> rest.sh -n dl_model_11 -hn localhost.localdomain -v 1.0 1>/stdout 
> 2>/stderr
> 19/12/09 20:03:40 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_e14_1575839541200_0017_01_02
> 19/12/09 20:03:40 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> localhost.localdomain:45454
> 19/12/09 20:03:40 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_e14_1575839541200_0017_01_02
> 19/12/09 20:03:40 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> localhost.localdomain:45454
> 19/12/09 20:04:14 ERROR service.ApplicationMaster: Received a null request...
>
>
>
>
>
> Thanks and regards,
> Hema
>
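For context, the rest.sh script in this kind of MaaS deployment launches a small HTTP endpoint wrapping the model, which MaaS then manages in a YARN container. A minimal stdlib-only sketch of such an endpoint follows; the `/apply` path, the `message` query parameter, and the scoring logic are illustrative assumptions, not the poster's actual deep_log.py.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def score(message):
    # Placeholder for the real model call (e.g. a model loaded from model.pkl).
    return {"input": message, "is_anomaly": len(message) > 80}

class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/apply":
            self.send_response(404)
            self.end_headers()
            return
        args = parse_qs(parsed.query)
        message = args.get("message", [""])[0]
        body = json.dumps(score(message)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the container log quiet

# To actually serve (as rest.sh would):
# HTTPServer(("0.0.0.0", 1500), ModelHandler).serve_forever()
```

Testing this endpoint directly with curl before handing it to MaaS (as the poster did) is a good way to isolate whether a failure is in the model code or in the YARN/MaaS plumbing.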


How can I send a batch of data to MaaS

2019-12-10 Thread Hema malini
Hi,

Is there any way to pass a batch of data to Metron MaaS? We have some
models, such as LSTMs, which require data to be aggregated before being
passed to the model. Can you please suggest whether it is possible?

Thanks and Regards,
Hema
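One way to picture the aggregation being asked about: accumulate a sliding per-key window of events and only hand a sequence to the model once the window is full. This is a sketch of the idea outside Metron; the window size and the notion of key are assumptions, not Metron APIs.

```python
from collections import defaultdict, deque

class SequenceBatcher:
    """Accumulate per-key event streams into fixed-length sequences
    suitable as input for a sequence model such as an LSTM."""

    def __init__(self, seq_len=5):
        self.seq_len = seq_len
        # One bounded sliding window per key (e.g. per source host).
        self.windows = defaultdict(lambda: deque(maxlen=seq_len))

    def add(self, key, event):
        """Add one event; return a full sequence once the window fills,
        otherwise None."""
        window = self.windows[key]
        window.append(event)
        if len(window) == self.seq_len:
            return list(window)
        return None

batcher = SequenceBatcher(seq_len=3)
assert batcher.add("host-a", 1) is None
assert batcher.add("host-a", 2) is None
assert batcher.add("host-a", 3) == [1, 2, 3]
# Sliding window: each later event yields the next overlapping sequence.
assert batcher.add("host-a", 4) == [2, 3, 4]
```

In Metron terms this windowing role is what the profiler plays, as the replies in this thread suggest.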


Issue: reindexing of some events on parsers restart

2019-12-10 Thread Vladimir Mikhailov
Hi 

We found an unpleasant consequence of each restart of the parsers: each time,
some of the events are reindexed again. Unfortunately, this was confirmed by
several dedicated tests.

Perhaps the reason for this is the method used to stop the Storm topology
immediately: "killTopologyWithOpts" with the option "set_wait_secs(0)".
Because of this, the topology does not have time to commit the current
offsets of already-processed events to Kafka.

After the parser restarts, the KafkaSpout starts reading the uncommitted
events again, and therefore some events are indexed twice.

So the question is: is there a more elegant way to stop the parser topology in 
order to avoid the problems described above? Of course, we are talking about 
changes to the source code, not some options or settings.

If such a solution exists and the problem can be fixed, then I can create the 
corresponding issue at https://issues.apache.org/jira/browse/METRON
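The failure mode described above can be illustrated with a toy consumer model: if the process dies before committing its offsets, the restarted reader rewinds to the last committed offset and re-emits events that were already indexed. This is a simplified illustration of at-least-once replay, not Metron or Kafka client code.

```python
class ToyConsumer:
    """Simplified at-least-once consumer: on restart it resumes from
    the last *committed* offset, not the last position it read."""

    def __init__(self, log):
        self.log = log
        self.committed = 0   # last committed offset
        self.position = 0    # current read position

    def poll(self, n):
        batch = self.log[self.position:self.position + n]
        self.position += len(batch)
        return batch

    def commit(self):
        self.committed = self.position

    def restart(self):
        # Reading resumes from the last commit.
        self.position = self.committed

log = ["e1", "e2", "e3", "e4"]
consumer = ToyConsumer(log)
indexed = []

indexed += consumer.poll(2)   # e1, e2 processed and indexed...
consumer.restart()            # ...but killed before commit (wait_secs = 0)
indexed += consumer.poll(4)   # e1, e2 come back: duplicates in the index

assert indexed == ["e1", "e2", "e1", "e2", "e3", "e4"]
```

A non-zero kill wait gives the topology a chance to call the equivalent of `commit()` before shutdown, which removes the replayed prefix.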


Seeking comments - Solr and Elasticsearch

2019-12-10 Thread Yerex, Tom
Good afternoon,


I’m fishing for some insight and experience, hopefully someone has a strong 
opinion and is willing to share.

 

We are currently exploring the indexing options available in Metron. From what
I can gather, Elasticsearch has a great marketing budget and Solr has some large
organizations using it, such as Walmart, but they are both essentially the same
thing under the hood. I see the latest version of Elastic is moving into SIEM
territory, which troubles me, as I like a product with focus; I appreciate that
Solr seems to be focused on doing what it does and only that.

 

We use Elasticsearch in another log-related project here; it is a bit of a
love-hate relationship, but overall the product works well with proper planning
and care. Solr has never been used here before, but I personally like the
interface, and it has the feel of a technically challenging but somehow more
mature product. We are not particularly invested in one solution over the
other, and any comparison so far has been fairly superficial.

 

Something in my gut suggests to me that we may be better off using Solr, but I 
can’t quite pinpoint the reason on a technical level. Has anyone considered
these options and found insight or a good reason to choose one over the other?
Perhaps you found a good reason to run both?

 

Thank you,

 

Tom.





Re: Issue: reindexing of some events on parsers restart

2019-12-10 Thread Vladimir Mikhailov
Hi Michael

I think the problem is not on the REST side, but in the "StormCLIWrapper", 
which it uses:

https://github.com/apache/metron/blob/88f4d2cefe4bbb389732da3b4f5cbcf02b7b949a/metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/StormCLIWrapper.java#L145

Each of the "StormCLIWrapper" methods stopParserTopology, 
stopEnrichmentTopology and stopIndexingTopology simply stops the corresponding 
topology with the command "storm kill  [-w 0]", leading to the described 
unpleasant consequences with re-indexing.

Perhaps, instead, we should give the topology a command to stop, then wait 
until it finishes processing the current events and commits its offsets to Kafka?
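For illustration, the difference largely amounts to passing a non-zero wait to `storm kill`, which gives Storm time to deactivate the spouts and drain in-flight tuples before the workers are killed. A hypothetical sketch of building such a command follows; it mirrors the idea, but is not the actual Java StormCLIWrapper code, and the default wait is an assumption.

```python
def build_kill_command(topology_name, wait_secs=30):
    """Build a `storm kill` command line. A positive wait_secs lets Storm
    deactivate spouts and drain in-flight tuples (and commit offsets)
    before the workers die; 0 kills the topology immediately."""
    if wait_secs < 0:
        raise ValueError("wait_secs must be non-negative")
    return ["storm", "kill", topology_name, "-w", str(wait_secs)]

assert build_kill_command("bro") == ["storm", "kill", "bro", "-w", "30"]
# The problematic immediate kill described in this thread:
assert build_kill_command("bro", 0) == ["storm", "kill", "bro", "-w", "0"]
```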


On 2019/12/10 18:18:28, Michael Miklavcic  wrote: 
> Where are you seeing this? As far as I can tell, the UI and REST endpoints
> default to a graceful shutdown.
> https://github.com/apache/metron/blob/master/metron-interface/metron-config/src/app/service/storm.service.ts#L154
> https://github.com/apache/metron/blob/master/metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/StormController.java#L91


Re: Issue: reindexing of some events on parsers restart

2019-12-10 Thread Michael Miklavcic
It only does that if the arg stopNow is true. It's always false per the
previous snippets I shared.



Re: How can I send a batch of data to MaaS

2019-12-10 Thread Simon Elliston Ball
If you’re looking to send sequences to an LSTM model, you are probably
looking for the profiler, which can assemble sequential features such as
those that would go into an LSTM. You would then use the triage output
method from the profiler to pass a stream of batches to MaaS.

Simon

On Tue, 10 Dec 2019 at 16:16, Hema malini  wrote:

> Thanks Otto for the confirmation.
>
> On Tue, 10 Dec, 2019, 8:46 PM Otto Fowler, 
> wrote:
>
>> As Metron is a streaming system, it doesn’t send batches as part of
>> normal in flow operation. MAAS is called through stellar, which operates on
>> a per message basis.
>>
>> The batching we *do* have is at the termination of the stream, at the
>> indexing where we batch writes out of the pipeline. This won’t help you
>> with stellar however.
--
simon elliston ball
@sireb


Re: How can I send a batch of data to MaaS

2019-12-10 Thread Hema malini
Thanks Simon, will explore the Metron profiler.



Re: Issue: reindexing of some events on parsers restart

2019-12-10 Thread Michael Miklavcic
Where are you seeing this? As far as I can tell, the UI and REST endpoints
default to a graceful shutdown.
https://github.com/apache/metron/blob/master/metron-interface/metron-config/src/app/service/storm.service.ts#L154
https://github.com/apache/metron/blob/master/metron-interface/metron-rest/src/main/java/org/apache/metron/rest/controller/StormController.java#L91

