[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-07-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16538528#comment-16538528
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1019


> Update REST to run YARN and MR jobs
> ---
>
> Key: METRON-1555
> URL: https://issues.apache.org/jira/browse/METRON-1555
> Project: Metron
> Issue Type: Sub-task
> Reporter: Ryan Merriman
> Assignee: Ryan Merriman
> Priority: Major
>
> This task involves enabling REST to submit YARN or MR jobs.  We will likely need to:
>  * update Maven dependencies to include YARN and MR libraries in the classpath and resolve any version conflicts
>  * update the REST start script to include properties required for YARN
>  * update the MPack for any additional setup work (for example, creating a user HDFS directory) and any properties needed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522984#comment-16522984
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1019
  
Thanks for the adjustments @merrimanr. Looks good, +1.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519457#comment-16519457
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr closed the pull request at:

https://github.com/apache/metron/pull/1019




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519458#comment-16519458
 ] 

ASF GitHub Bot commented on METRON-1555:


GitHub user merrimanr reopened a pull request:

https://github.com/apache/metron/pull/1019

METRON-1555: Update REST to run YARN and MR jobs

## Contributor Comments
This PR sets us up to run YARN and MR jobs inside our REST application.  
Changes include:
- addition of maven dependencies
- addition of -Dhdp.version parameter to the REST startup script
- MPack now supplies the hdp.version parameter
- MPack now sets up a "metron" service user HDFS directory needed for 
running MR jobs
- MPack now sets up a pcap HDFS directory 
- addition of a Pcap controller with a single Fixed Pcap Query endpoint and 
service to demonstrate running MR jobs in REST
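
Since the REST startup script now passes `-Dhdp.version`, the application could verify the flag at startup and fail early if it is missing. The following is a minimal sketch under that assumption; the class `HdpVersionCheck` and its method are hypothetical and are not part of this PR.

```java
public class HdpVersionCheck {

  // Property name taken from the -Dhdp.version flag described in the change list.
  static final String HDP_VERSION_PROPERTY = "hdp.version";

  /** Returns the trimmed hdp.version value, or throws if the flag was not supplied. */
  static String requireHdpVersion() {
    String version = System.getProperty(HDP_VERSION_PROPERTY);
    if (version == null || version.trim().isEmpty()) {
      throw new IllegalStateException(
          "Missing -D" + HDP_VERSION_PROPERTY + "; it is expected to be set by the REST start script");
    }
    return version.trim();
  }

  public static void main(String[] args) {
    try {
      System.out.println(HDP_VERSION_PROPERTY + " = " + requireHdpVersion());
    } catch (IllegalStateException e) {
      // Report the problem instead of crashing this demo.
      System.out.println(e.getMessage());
    }
  }
}
```

Run it with, for example, `java -Dhdp.version=<your HDP version> HdpVersionCheck`; the version value itself depends on the cluster.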

The fixed pcap query endpoint submitted here should match the functionality 
in the metron-api module with a few minor differences:  

- the default input and output paths are spring properties instead of 
hardcoded in classes (this will make it easier to expose them in Ambari if we 
choose to)
- query results are not cleaned up automatically since that work is 
captured in a separate Jira
- the number of reducers defaults to 1 instead of 10
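
The idea of property-backed defaults in the list above can be sketched without Spring using plain `java.util.Properties`. This is an illustration only: the property keys and the class `PcapQueryDefaults` are hypothetical (this description does not name the actual Spring properties), the input path default is assumed from the `/apps/metron/pcap` directory used in the test steps, and the reducer default of 1 is the one stated above.

```java
import java.util.Properties;

public class PcapQueryDefaults {

  // Hypothetical keys; the real Spring property names are not given in this description.
  static final String INPUT_PATH_KEY = "pcap.input.path";
  static final String NUM_REDUCERS_KEY = "pcap.num.reducers";

  // Assumed default input path (the HDFS directory used in the test steps).
  static final String DEFAULT_INPUT_PATH = "/apps/metron/pcap";
  // Default stated in this description: 1 reducer instead of 10.
  static final int DEFAULT_NUM_REDUCERS = 1;

  /** Returns the configured input path, falling back to the default. */
  static String inputPath(Properties props) {
    return props.getProperty(INPUT_PATH_KEY, DEFAULT_INPUT_PATH);
  }

  /** Returns the configured reducer count, falling back to the default. */
  static int numReducers(Properties props) {
    String raw = props.getProperty(NUM_REDUCERS_KEY);
    return raw == null ? DEFAULT_NUM_REDUCERS : Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(inputPath(props) + " with " + numReducers(props) + " reducer(s)");
    // Overriding a value replaces the default without touching the code.
    props.setProperty(NUM_REDUCERS_KEY, "10");
    System.out.println(inputPath(props) + " with " + numReducers(props) + " reducer(s)");
  }
}
```

Keeping the defaults in one place is what would make it straightforward to surface them later through Ambari.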

Unit and integration tests are included and this has been tested in full 
dev.  I tested this by generating sample pcap data with the 
PcapTopologyIntegrationTest.  You can do this by either:

- running the test in your IDE and pausing it after the topology has 
generated data
- commenting out `clearOutDir(outDir);` and running the test

Pcap data should be present in 
`/metron/metron-platform/metron-pcap-backend/target/pcap/data_dir`.  Upload the 
`pcap*` files to the `/apps/metron/pcap` directory in HDFS.  You should be able 
to perform the tests in PcapTopologyIntegrationTest using REST and get the same 
results.  

For example:
```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "endTime": 1458240269424,
  "startTime": 1458240269419
}' 'http://node1:8082/api/v1/pcap/fixed'
```
should return 2 pcap results.

```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{}' 'http://node1:8082/api/v1/pcap/fixed'
```
should return 20 pcap results.

```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "ipDstAddr":"207.28.210.1"
}' 'http://node1:8082/api/v1/pcap/fixed'
```
should return no pcap results.

```
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "ipDstPort": 22
}' 'http://node1:8082/api/v1/pcap/fixed'
```
should return 10 pcap results.
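
The request bodies used in the curl examples above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper class `FixedPcapRequestBody`; only the field names (`startTime`, `endTime`, `ipDstAddr`, `ipDstPort`) and values come from the examples, and the helper does no JSON string escaping, so it is only suitable for simple values like these.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FixedPcapRequestBody {

  /**
   * Renders a flat map as a JSON object. Numbers are written bare, everything
   * else is written as a quoted string. No escaping is performed.
   */
  static String toJson(Map<String, Object> fields) {
    StringJoiner json = new StringJoiner(",", "{", "}");
    for (Map.Entry<String, Object> entry : fields.entrySet()) {
      Object value = entry.getValue();
      String rendered = (value instanceof Number) ? value.toString() : "\"" + value + "\"";
      json.add("\"" + entry.getKey() + "\":" + rendered);
    }
    return json.toString();
  }

  public static void main(String[] args) {
    // Time-range query from the first example.
    Map<String, Object> timeRange = new LinkedHashMap<>();
    timeRange.put("endTime", 1458240269424L);
    timeRange.put("startTime", 1458240269419L);
    System.out.println(toJson(timeRange));

    // Destination-port filter from the last example.
    Map<String, Object> byPort = new LinkedHashMap<>();
    byPort.put("ipDstPort", 22);
    System.out.println(toJson(byPort));

    // An empty map yields the empty filter used in the second example.
    System.out.println(toJson(new LinkedHashMap<>()));
  }
}
```

The resulting string would be sent as the POST body to `/api/v1/pcap/fixed` with a `Content-Type: application/json` header, as in the curl commands.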

Related discussion can be found here:


http://mail-archives.apache.org/mod_mbox/metron-dev/201805.mbox/%3ccaevkqpbxzjnu_wgrbfwnz-mvqnkb7mthedveq9plyhwfit7...@mail.gmail.com%3E

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [ ] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [ ] Have you written or updated unit tests and or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependen

[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519384#comment-16519384
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/1019
  
@merrimanr Can you kick Travis?




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519383#comment-16519383
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user justinleet commented on the issue:

https://github.com/apache/metron/pull/1019
  
I'm good with this right now with the follow-on changes and tests, assuming 
@mmiklavc is good.  I share the concerns Mike raised, but I think being 
watchful going forward is a good compromise, even though it will take a bit of 
vigilance here.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516373#comment-16516373
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1019
  
The latest commit adds the fixed endpoint.  I updated the PR description 
with steps to test it in full dev.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509627#comment-16509627
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1019
  
How about starting with one of those, the fixed endpoint, and maybe making it 
synchronous in the first pass so you can integration test it? I'd consider many 
of those additional API endpoints as features to be built out in this feature branch.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508825#comment-16508825
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1019
  
Sure.  In that case we need to break up 
https://issues.apache.org/jira/browse/METRON-1559 into smaller tasks, which we 
need to do anyway.  Which endpoints would you include in a basic 
PcapService?  Maybe just these 2:

- POST /api/v1/pcap/fixed
- POST /api/v1/pcap/query

We could manually test in full dev by inspecting results in HDFS.  I still 
think a basic PcapService with a couple endpoints is going to require a large 
PR.  Just want to make sure you're ok with that.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508702#comment-16508702
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1019
  
I guess I'm saying combine this with a basic PcapService that has an API 
somewhat fleshed out: not necessarily final, but close enough to start filling 
in implementation details when new features, and PRs with those features, are added. 
It's effectively what you have already with just a bit more of the service 
implemented/defined.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508690#comment-16508690
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1019
  
This was my attempt to split it up into a more manageable review.  I think 
this will end up being a large PR but that's fine with me as long as everyone 
else is ok with it.  Happy to keep going.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508663#comment-16508663
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on the issue:

https://github.com/apache/metron/pull/1019
  
@merrimanr Thanks for adding the discuss thread links.

It sounds like this doesn't do much aside from a pcap service. Why not just 
handle them together? That way we're not dealing with TODOs and extra lists, 
and together they would form a concise unit of work.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505266#comment-16505266
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user merrimanr commented on the issue:

https://github.com/apache/metron/pull/1019
  
I added a link to the discuss thread to the description.  There is also a 
link in the top-level Jira:  https://issues.apache.org/jira/browse/METRON-1554.

The latest commit removes the Service classes so that only a sample 
controller remains.  I added a comment to that controller stating it should be 
replaced.  There was a Jira created for service development and API design:  
https://issues.apache.org/jira/browse/METRON-1559.  Hopefully that makes it 
clearer that API design is out of scope for this PR.  I can also add a comment 
to that Jira once this PR gets accepted.

I feel like spinning up and testing this in full dev should be part of the 
review.  Most of the changes here are Ambari and Maven related.  To test this 
properly there will need to be some way to submit a MR job (any job) from REST, 
hence the throwaway sample controller.  If you want to create your own sample 
controller and test that way I can remove the one in this PR.  Is there an 
easier option I'm not thinking of?

Maintaining a needs addressed list sounds like a good idea.  I'm not clear 
on how this would work in practice though.  Where would we keep it?  Would Jira 
tasks serve this purpose?  This seems like it could be applicable to all tasks 
in this feature branch (all feature branches really) so I don't think this is 
the appropriate place to discuss it.  Can we move it back to a discuss thread 
so you can clarify and expand on your idea?  I think it needs some more thought 
if we're to adopt it as a process.






[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503989#comment-16503989
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1019#discussion_r193573872
  
--- Diff: 
metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/PcapQueryServiceImpl.java
 ---
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.rest.service.impl;
+
+import com.google.common.collect.Lists;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.metron.common.hadoop.SequenceFileIterable;
+import org.apache.metron.common.utils.timestamp.TimestampConverters;
+import org.apache.metron.pcap.PcapMerger;
+import org.apache.metron.pcap.filter.query.QueryPcapFilter;
+import org.apache.metron.pcap.mr.PcapJob;
+import org.apache.metron.rest.model.PcapsResponse;
+import org.apache.metron.rest.service.PcapQueryService;
+import org.springframework.stereotype.Service;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+@Service
+public class PcapQueryServiceImpl implements PcapQueryService {
+
+  @Override
+  public PcapsResponse query() {
--- End diff --

Noting this as an incomplete API definition and implementation.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503987#comment-16503987
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1019#discussion_r193571493
  
--- Diff: 
metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/impl/PcapQueryServiceImpl.java
 ---
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.rest.service.impl;
+
+import com.google.common.collect.Lists;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.metron.common.hadoop.SequenceFileIterable;
+import org.apache.metron.common.utils.timestamp.TimestampConverters;
+import org.apache.metron.pcap.PcapMerger;
+import org.apache.metron.pcap.filter.query.QueryPcapFilter;
+import org.apache.metron.pcap.mr.PcapJob;
+import org.apache.metron.rest.model.PcapsResponse;
+import org.apache.metron.rest.service.PcapQueryService;
+import org.springframework.stereotype.Service;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+@Service
+public class PcapQueryServiceImpl implements PcapQueryService {
+
+  @Override
+  public PcapsResponse query() {
+    PcapsResponse response = new PcapsResponse();
+    PcapJob pcapJob = new PcapJob();
+    Configuration configuration = new Configuration();
+    try {
+      SequenceFileIterable results = pcapJob.query(
+          new Path("/apps/metron/pcap"),
--- End diff --

These paths should be configurable




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-06-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503988#comment-16503988
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user mmiklavc commented on a diff in the pull request:

https://github.com/apache/metron/pull/1019#discussion_r193573780
  
--- Diff: 
metron-interface/metron-rest/src/main/java/org/apache/metron/rest/service/PcapQueryService.java
 ---
@@ -0,0 +1,25 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.metron.rest.service;
+
+import org.apache.metron.rest.model.PcapsResponse;
+
+public interface PcapQueryService {
+
+  PcapsResponse query();
--- End diff --

Noting this as an incomplete implementation.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479043#comment-16479043
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user JonZeolla commented on the issue:

https://github.com/apache/metron/pull/1019
  
Gotcha, GitHub mobile seems to collapse that information.  👍




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478895#comment-16478895
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user ottobackwards commented on the issue:

https://github.com/apache/metron/pull/1019
  
This PR is against the feature branch, @JonZeolla, so it is not in play.




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478415#comment-16478415
 ] 

ASF GitHub Bot commented on METRON-1555:


Github user JonZeolla commented on the issue:

https://github.com/apache/metron/pull/1019
  
Based on the temporary/example state of some of this PR, would it make sense 
to explicitly keep this one out of the upcoming release, regardless of the 
review cycle?




[jira] [Commented] (METRON-1555) Update REST to run YARN and MR jobs

2018-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/METRON-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478072#comment-16478072
 ] 

ASF GitHub Bot commented on METRON-1555:


GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1019

METRON-1555: Update REST to run YARN and MR jobs

## Contributor Comments
This PR sets us up to run YARN and MR jobs inside our REST application.  
Changes include:
- addition of Maven dependencies
- addition of -Dhdp.version parameter to the REST startup script
- MPack now supplies the hdp.version parameter
- MPack now sets up a "metron" service user HDFS directory needed for 
running MR jobs
- MPack now sets up a pcap HDFS directory 
- addition of a sample Pcap controller and service to demonstrate running 
MR jobs in REST
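
The HDFS setup step above can be sketched roughly as follows. This is only an illustration, not the MPack's actual code: the helper name `hdfs_home_cmds`, the `/user/<name>` path, and the `hadoop` group are all assumptions. The helper prints the commands rather than running them, so the output can be reviewed first.

```shell
# Hypothetical helper: prints the HDFS shell commands that would create a
# home directory for a service user. The /user/<name> location and the
# default "hadoop" group are assumptions, not taken from the MPack.
hdfs_home_cmds() {
  user="$1"
  group="${2:-hadoop}"
  echo "hdfs dfs -mkdir -p /user/${user}"
  echo "hdfs dfs -chown ${user}:${group} /user/${user}"
}

# Print the commands for the "metron" service user.
hdfs_home_cmds metron
```

On a cluster node with suitable HDFS privileges, the printed commands could be piped to `sh` to actually run them.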

This should work in full dev as is.  All you need to do is generate some 
pcap data and hit the pcap endpoint included in this PR.  It should return raw 
binary pcap data.
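
As a quick sanity check on the endpoint's output, one could test whether a saved response starts with a pcap magic number. This is a minimal sketch, assuming the response is a complete pcap file with a global header (the PR only says it returns raw binary pcap data); the function name `is_pcap` is hypothetical.

```shell
# Hypothetical check: succeeds if the file's first four bytes match one of
# the classic pcap magic numbers (microsecond or nanosecond timestamps,
# in either byte order).
is_pcap() {
  # Read the first 4 bytes and render them as lowercase hex without spaces.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    a1b2c3d4|d4c3b2a1|a1b23c4d|4d3cb2a1) return 0 ;;
    *) return 1 ;;
  esac
}
```

With the response saved to a file, `is_pcap response.pcap && echo "looks like pcap"` would report a match; `response.pcap` is a placeholder file name.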

It is assumed the PcapController, PcapService, PcapServiceImpl and 
PcapResponse classes will all be replaced or significantly changed in future 
tasks.  They are presented here temporarily as examples.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.  
Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  


In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?


### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
- [ ] Have you ensured that the full suite of tests and checks has been 
executed in the root metron folder via:
  ```
  mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
  ```

- [ ] Have you written or updated unit tests and/or integration tests to 
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- [x] Have you ensured that the format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not, then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:

  ```
  cd site-book
  mvn site
  ```

 Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/merrimanr/incubator-metron pcap-rest-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/metron/pull/1019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1019


commit 4dcee51dd544e6c064dcc9dd1478c923a00c8281
Author: merrimanr 
Date:   2018-05-07T20:52:09Z

added simple pcap endpoint to rest

commit 22fe5e9ff3c167b42ebeb7a9f1000753a409aff1
Author: merrimanr 
Date:   2018-05-08T22:32:03Z

pcap query runs in rest




> Update REST to run YARN and MR jobs
> ---
>
> Key: METRON-1555
> URL: https://issues.apache.org/jira/browse/METRON-1555
> Project: Metron
>  Issue