That would be a step closer to a microservice architecture. However, I
would want to make sure we think about the operational complexity and
mpack implications of having another server installed and running
somewhere on the cluster (as well as the SSL, Kerberos, etc. requirements
for that service).

On 8 May 2018 at 14:27, Ryan Merriman <merrim...@gmail.com> wrote:

> +1 to having metron-api as its own service and using a gateway type
> pattern.
>
> On Tue, May 8, 2018 at 8:13 AM, Otto Fowler <ottobackwa...@gmail.com>
> wrote:
>
> > Why not have metron-api as its own service and use a ‘gateway’ type
> > pattern in rest?
> >
> >
> > On May 8, 2018 at 08:45:33, Ryan Merriman (merrim...@gmail.com) wrote:
> >
> > Moving the yarn classpath command earlier in the classpath now gives this
> > error:
> >
> > Caused by: java.lang.NoSuchMethodError:
> > javax.servlet.ServletContext.getVirtualServerName()Ljava/lang/String;
> >
> > I will experiment with other combinations; I suspect we will need
> > finer-grained control over the order.
> >
> > The grep matches class names inside jar files. I use this all the time and
> > it's really useful.
> >
> > The metron-rest jar is already shaded.
> >
> > Reverse engineering the yarn jar command was the next thing I was going to
> > try. Will let you know how it goes.
> >
> > On Tue, May 8, 2018 at 12:36 AM, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
> >
> > > In what order did you add the hadoop or yarn classpath? The "shaded" package
> > > stands out to me in this name:
> > > "org.apache.hadoop.hbase.*shaded*.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider".
> > > Maybe try adding those packages earlier on the classpath.
> > >
> > > I think that find command needs a "jar tvf", otherwise you're looking for a
> > > class name in jar file names.
> > >
> > > Have you tried shading the rest jar?
> > >
> > > I'd also look at the classpath you get when running "yarn jar" to start the
> > > existing pcap service, per the instructions in metron-api/README.md.
> > >
> > >
> > > On Mon, May 7, 2018 at 3:28 PM, Ryan Merriman <merrim...@gmail.com> wrote:
> > >
> > > > To explore the idea of merging metron-api into metron-rest and running pcap
> > > > queries inside our REST application, I created a simple test here:
> > > > https://github.com/merrimanr/incubator-metron/tree/pcap-rest-test. A
> > > > summary of what's included:
> > > >
> > > > - Added pcap as a dependency in the metron-rest pom.xml
> > > > - Added a pcap query controller endpoint at
> > > > http://node1:8082/swagger-ui.html#!/pcap-query-controller/queryUsingGET
> > > > - Added a pcap query service that runs a simple, hardcoded query
> > > >
> > > > Generate some pcap data using pycapa
> > > > (https://github.com/apache/metron/tree/master/metron-sensors/pycapa) and the
> > > > pcap topology
> > > > (https://github.com/apache/metron/tree/master/metron-platform/metron-pcap-backend#starting-the-topology).
> > > > After this initial setup there should be data in HDFS at
> > > > "/apps/metron/pcap". I believe this should be enough to exercise the
> > > > issue. Just hit the endpoint referenced above. I tested this in an
> > > > already running full dev by building and deploying the metron-rest jar. I
> > > > did not rebuild full dev with this change but I would still expect it to
> > > > work. Let me know if it doesn't.
> > > >
> > > > The first error I see when I hit this endpoint is:
> > > >
> > > > java.lang.NoClassDefFoundError:
> > > > org/apache/hadoop/yarn/webapp/YarnJacksonJaxbJsonProvider.
> > > >
> > > > Here are the things I've tried so far:
> > > >
> > > > - Run the REST application with the YARN jar command since this is how
> > > > all our other YARN/MR-related applications are started (metron-api, MAAS,
> > > > pcap query, etc). I wouldn't expect this to work since we have runtime
> > > > dependencies on our shaded elasticsearch and parser jars and I'm not aware
> > > > of a way to add additional jars to the classpath with the YARN jar command
> > > > (is there a way?). Either way I get this error:
> > > >
> > > > 18/05/04 19:49:56 WARN reflections.Reflections: could not create Dir using
> > > > jarFile from url file:/usr/hdp/2.6.4.0-91/hadoop/lib/ojdbc6.jar. skipping.
> > > > java.lang.NullPointerException
> > > >
> > > >
> > > > - I tried adding `yarn classpath` and `hadoop classpath` to the
> > > > classpath in /usr/metron/0.4.3/bin/metron-rest.sh (REST start script). I
> > > > get this error:
> > > >
> > > > java.lang.ClassNotFoundException:
> > > > org.apache.hadoop.hbase.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider
> > > >
> > > >
> > > > - I searched for the class in the previous attempt but could not find it
> > > > in full dev:
> > > >
> > > > find / -name "*.jar" 2>/dev/null | xargs grep \
> > > > org/apache/hadoop/hbase/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider \
> > > > 2>/dev/null
> > > >
> > > >
> > > > - Further up in the stack trace I see the error happens when initializing
> > > > the org.apache.hadoop.yarn.util.timeline.TimelineUtils class. I tried
> > > > setting "yarn.timeline-service.enabled" in Ambari to false and then I get
> > > > this error:
> > > >
> > > > Unable to parse
> > > > '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a
> > > > URI, check the setting for mapreduce.application.framework.path
> > > >
> > > >
> > > > - I've tried adding different hadoop, hbase, yarn and mapreduce Maven
> > > > dependencies without any success
> > > > - hadoop-yarn-client
> > > > - hadoop-yarn-common
> > > > - hadoop-mapreduce-client-core
> > > > - hadoop-yarn-server-common
> > > > - hadoop-yarn-api
> > > > - hbase-server
> > > >
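
For anyone repeating the class search from the list above, here is a minimal
Python sketch that opens each jar and checks its entry names, rather than
grepping the raw archive bytes. The function name jars_containing is
hypothetical and not part of Metron; it is one possible way to do the lookup,
not the approach used in this thread.

```python
import zipfile
from pathlib import Path


def jars_containing(root, class_path):
    """Yield each .jar under root with an entry whose name contains class_path.

    class_path uses slashes, e.g. "org/codehaus/jackson/jaxrs/SomeClass".
    """
    for jar in Path(root).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as archive:
                # Compare against entry names inside the archive,
                # not against the jar's own file name.
                if any(class_path in name for name in archive.namelist()):
                    yield jar
        except (zipfile.BadZipFile, OSError):
            # Skip corrupt or unreadable archives, much as the
            # 2>/dev/null redirects hide errors in the shell version.
            continue
```

Calling list(jars_containing("/", "org/apache/hadoop/hbase/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider"))
would be roughly equivalent to the find command above, though walking the
whole filesystem can be slow.
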
> > > > I will keep exploring other possible solutions. Let me know if anyone has
> > > > any ideas.
> > > >
> > > > On Mon, May 7, 2018 at 9:02 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> > > >
> > > > > I can imagine a new generic service(s) capability whose job (pun intended)
> > > > > is to abstract the submittal, tracking, and storage of results to yarn.
> > > > >
> > > > > It would be extended with storage providers, queue providers, and possibly
> > > > > some set of policies or rather strategies.
> > > > >
> > > > > The pcap ‘report’ would be a client to that service, one that specializes
> > > > > the service operation for the way we want pcap to work.
> > > > >
> > > > > We can then re-use the generic service for other long running yarn
> > > > > things…
> > > > >
> > > > >
> > > > > On May 7, 2018 at 09:56:51, Otto Fowler (ottobackwa...@gmail.com) wrote:
> > > > >
> > > > > RE: Tracking v. users
> > > > >
> > > > > The submittal and tracking can associate the submitter with the yarn job
> > > > > and track that, regardless of the yarn credentials.
> > > > >
> > > > > I.e., if all submittals and monitoring are by the same yarn user (Metron)
> > > > > from a single or co-operative set of services, that service can maintain
> > > > > the mapping.
> > > > >
> > > > >
> > > > >
> > > > > On May 7, 2018 at 09:39:52, Ryan Merriman (merrim...@gmail.com) wrote:
> > > > >
> > > > > Otto, your use case makes sense to me. We'll have to think about how to
> > > > > manage the user to job relationships. I'm assuming YARN jobs will be
> > > > > submitted as the metron service user so YARN won't keep track of this for
> > > > > us. Is that assumption correct? Do you have any ideas for doing that?
> > > > >
> > > > > Mike, I can start a feature branch and experiment with merging metron-api
> > > > > into metron-rest. That should allow us to collaborate on any issues or
> > > > > challenges. Also, can you expand on your idea to manage external
> > > > > dependencies as a special module? That seems like a very attractive option
> > > > > to me.
> > > > >
> > > > > On Fri, May 4, 2018 at 8:39 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:
> > > > >
> > > > > > From my response on the other thread, but applicable to the backend
> > > > > > stuff:
> > > > > >
> > > > > > "The PCAP Query seems more like PCAP Report to me. You are generating a
> > > > > > report based on parameters.
> > > > > > That report is something that takes some time and an external process to
> > > > > > generate… ie you have to wait for it.
> > > > > >
> > > > > > I can almost imagine a flow where you:
> > > > > >
> > > > > > * Are in the AlertUI
> > > > > > * Ask to generate a PCAP report based on some selected alerts/meta-alert,
> > > > > > possibly picking from one or more report ‘templates’
> > > > > > that have query options etc
> > > > > > * The report request is ‘queued’, that is, dispatched to be
> > > > > > executed/generated
> > > > > > * You as a user have a ‘queue’ of your report results, and when the report
> > > > > > is done it is queued there
> > > > > > * We ‘monitor’ the report/queue progress through the yarn rest (report
> > > > > > info/meta has the yarn details)
> > > > > > * You can select the report from your queue and view it either in a new UI
> > > > > > or custom component
> > > > > > * You can then apply a different ‘view’ to the report or work with the
> > > > > > report data
> > > > > > * You can print / save etc
> > > > > > * You can associate the report with the alerts (again in the report info)
> > > > > > with… a ‘case’ or ‘ticket’ or investigation something or other
> > > > > >
> > > > > > We can introduce extensibility into the report templates, report views
> > > > > > (things that work with the json data of the report)
> > > > > >
> > > > > > Something like that.”
> > > > > >
> > > > > > Maybe we can do:
> > > > > >
> > > > > > template -> query parameters -> script => yarn info
> > > > > > yarn info + query info + alert context + yarn status => report info ->
> > > > > > stored in a user’s ‘report queue’
> > > > > > report persistence added to report info
> > > > > > metron-rest -> api to monitor the queue, read results (page), etc.
> > > > > >
> > > > > >
> > > > > > On May 4, 2018 at 09:23:39, Ryan Merriman (merrim...@gmail.com) wrote:
> > > > > >
> > > > > > I started a separate thread on Pcap UI considerations and user
> > > > > > requirements at Otto's request. This should help us keep these two
> > > > > > related but separate discussions focused.
> > > > > >
> > > > > > On Fri, May 4, 2018 at 7:19 AM, Michel Sumbul <michelsum...@gmail.com> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > (Youhouuu my first reply on this kind of mail chain^^)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > If I may, I would like to share my view on the following 3 points.
> > > > > > >
> > > > > > > - Backend:
> > > > > > >
> > > > > > > The current metron-api is totally separate; it would be logical to me to
> > > > > > > have it in the same place as the other REST APIs. Especially when more
> > > > > > > security is added, the job will not need to be done twice.
> > > > > > > The current implementation sends back a pcap object which still needs to
> > > > > > > be decoded. In OpenSOC, the decoding was done with tshark on the
> > > > > > > frontend. It would be good to have this decoding happen directly on the
> > > > > > > backend so as not to create load on the frontend. An option would be to
> > > > > > > install tshark on the rest server and use it to convert the pcap to xml
> > > > > > > and then to json that is sent to the frontend.
> > > > > > >
> > > > > > > I tried to start the map/reduce job that searches over all the pcap
> > > > > > > data directly from the rest server and, as Ryan mentioned, we had
> > > > > > > trouble. I will try to find the error again.
> > > > > > >
> > > > > > > Then in the POC, what we tried was to use the pcap_query script, and this
> > > > > > > works fine. I just modified it so that it sends back the yarn job_id
> > > > > > > directly instead of waiting for the job to finish. This allows the UI
> > > > > > > and the rest server to learn the status of the search by querying the
> > > > > > > yarn rest api, so the UI and the rest server can be async
> > > > > > > without any blocking phase. What do you think about that?
> > > > > > >
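
The async flow described above could look roughly like the following Python
sketch. It assumes the ResourceManager REST endpoint
/ws/v1/cluster/apps/{appid}, that the ResourceManager web address is node1 on
port 8088 (an assumption; adjust for your cluster), and that the script hands
back a YARN application id. All function names here are hypothetical.

```python
import json
import urllib.request

# Assumption: ResourceManager web address; 8088 is the usual default port.
RM_URL = "http://node1:8088"

# Application states after which polling can stop.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def app_status_url(app_id, rm_url=RM_URL):
    """Build the ResourceManager REST URL for a single application."""
    return f"{rm_url}/ws/v1/cluster/apps/{app_id}"


def parse_status(body):
    """Extract the fields a UI would poll from the ResourceManager JSON."""
    app = json.loads(body)["app"]
    return {
        "state": app["state"],
        "final_status": app["finalStatus"],
        "progress": app.get("progress"),
    }


def is_done(status):
    """True once the application has reached a terminal state."""
    return status["state"] in TERMINAL_STATES


def fetch_status(app_id, rm_url=RM_URL):
    """Query the ResourceManager once; the caller decides how often to poll."""
    with urllib.request.urlopen(app_status_url(app_id, rm_url), timeout=10) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

A REST endpoint could call fetch_status on each UI request and return the
result as-is, so nothing blocks while the map/reduce job runs.
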
> > > > > > >
> > > > > > >
> > > > > > > Having the job submitted directly from the code of the rest server would
> > > > > > > be perfect, but it will need a lot of investigation I think (but I'm not
> > > > > > > the expert so I might be completely wrong ^^).
> > > > > > >
> > > > > > > We know that the pcap_query script works fine, so why not call it? Is it
> > > > > > > that bad? (maybe a stupid question, but I really don’t see many
> > > > > > > drawbacks)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > - Front end:
> > > > > > >
> > > > > > > Adding the pcap search to the alert UI is, I think, the easiest way to
> > > > > > > move forward. But indeed, it will then be the “Alert UI and pcapquery”.
> > > > > > > Maybe the name of the UI should just change to something like
> > > > > > > “Monitoring & Investigation UI”?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Is there any roadmap or plan for the different UIs? I mean, have you
> > > > > > > already had discussions on how you see the UI evolving with the new
> > > > > > > features that will come in the future?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > - Microservices:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > What do you mean exactly by microservices? Is it to separate all the
> > > > > > > features into different projects? Or something like having the different
> > > > > > > components in containers, like Kubernetes? (again maybe a stupid question,
> > > > > > > but I don’t clearly understand what you mean :) )
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Michel
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> >
>



-- 
--
simon elliston ball
@sireb
