[jira] [Commented] (METRON-1703) Make Core Profiler Components Serializable
[ https://issues.apache.org/jira/browse/METRON-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580502#comment-16580502 ] ASF GitHub Bot commented on METRON-1703: Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1145 Looks pretty straightforward. Why were the classes in org.apache.metron.profiler.hbase not updated to be Serializable? I'm sure this has been tested and is not needed, just curious. > Make Core Profiler Components Serializable > -- > > Key: METRON-1703 > URL: https://issues.apache.org/jira/browse/METRON-1703 > Project: Metron > Issue Type: Sub-task >Reporter: Nick Allen >Assignee: Nick Allen >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1145: METRON-1703 Make Core Profiler Components Serializable [...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1145 Looks pretty straightforward. Why were the classes in org.apache.metron.profiler.hbase not updated to be Serializable? I'm sure this has been tested and is not needed, just curious. ---
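One way to answer a "does this class need to be Serializable?" question is a round-trip check through Java serialization. The sketch below is illustrative only: `PeriodSketch` and `SerializableCheck` are hypothetical names, not classes from `org.apache.metron.profiler`, and the fields are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a profiler component; not a Metron class.
class PeriodSketch implements Serializable {
  private static final long serialVersionUID = 1L;
  final long periodId;
  final String profileName;

  PeriodSketch(long periodId, String profileName) {
    this.periodId = periodId;
    this.profileName = profileName;
  }
}

class SerializableCheck {
  // Serializes the object to bytes and reads it back, the way a
  // distributed engine might when it ships an object to another JVM.
  static Object roundTrip(Object original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(original);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      // A java.io.NotSerializableException surfaces here when the object,
      // or anything it references, does not implement Serializable.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    PeriodSketch copy = (PeriodSketch) roundTrip(new PeriodSketch(42L, "example"));
    System.out.println(copy.periodId + " " + copy.profileName);
  }
}
```

A class that never crosses a JVM boundary would not hit this path, which is one plausible reason for leaving a package untouched.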
[jira] [Commented] (METRON-1737) Document Job cleanup
[ https://issues.apache.org/jira/browse/METRON-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580355#comment-16580355 ] ASF GitHub Bot commented on METRON-1737: GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1164 METRON-1737: Document Job cleanup ## Contributor Comments This PR adds documentation around the directory structure of Pcap query results and recommendations for managing their growth. It also includes a recommendation to configure a dedicated Pcap YARN queue and a warning for large date ranges in queries. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? 
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1737 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1164.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1164 commit 842670a30a59b3498b3d626b4507499a611f3954 Author: merrimanr Date: 2018-08-14T19:52:20Z initial commit > Document Job cleanup > > > Key: METRON-1737 > URL: https://issues.apache.org/jira/browse/METRON-1737 > Project: Metron > Issue Type: Sub-task >Reporter: Ryan Merriman >Priority: Major > > Pcap query results are written to HDFS. 
Over time, more HDFS file space will > be used as queries are run. There is currently no automated cleanup feature, > so we need to document how to do this in case a user needs to do it manually > or with a script. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
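A minimal sketch of the kind of age-based cleanup the ticket describes. It is written against the local file system with `java.nio.file` so that it stays self-contained; on a real cluster the same rule would have to go through HDFS tooling instead (for example `hdfs dfs -rm -r`). The class name, the retention period and the "one directory per result" layout are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of Metron.
class StaleResultCleanup {
  // True when lastModified is strictly older than (now - maxAge).
  static boolean isStale(Instant lastModified, Instant now, Duration maxAge) {
    return lastModified.isBefore(now.minus(maxAge));
  }

  // Deletes each direct child of root that is stale and returns the deleted paths.
  static List<Path> deleteOlderThan(Path root, Duration maxAge, Instant now) {
    List<Path> deleted = new ArrayList<>();
    try {
      List<Path> children;
      try (Stream<Path> listing = Files.list(root)) {
        children = listing.collect(Collectors.toList());
      }
      for (Path child : children) {
        if (isStale(Files.getLastModifiedTime(child).toInstant(), now, maxAge)) {
          deleteRecursively(child);
          deleted.add(child);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return deleted;
  }

  // Removes files before their parent directories by walking in reverse order.
  private static void deleteRecursively(Path path) throws IOException {
    List<Path> toDelete;
    try (Stream<Path> walk = Files.walk(path)) {
      toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : toDelete) {
      Files.delete(p);
    }
  }
}
```

Calling `deleteOlderThan(root, Duration.ofDays(7), Instant.now())` from a scheduled job would keep roughly one week of results; the seven-day figure is only an example.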
[GitHub] metron pull request #1164: METRON-1737: Document Job cleanup
GitHub user merrimanr opened a pull request: https://github.com/apache/metron/pull/1164 METRON-1737: Document Job cleanup ## Contributor Comments This PR adds documentation around the directory structure of Pcap query results and recommendations for managing their growth. It also includes a recommendation to configure a dedicated Pcap YARN queue and a warning for large date ranges in queries. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron. Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via: ``` mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh ``` - [x] Have you written or updated unit tests and or integration tests to verify your changes? 
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`: ``` cd site-book mvn site ``` Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/merrimanr/incubator-metron METRON-1737 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1164.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1164 commit 842670a30a59b3498b3d626b4507499a611f3954 Author: merrimanr Date: 2018-08-14T19:52:20Z initial commit ---
[jira] [Commented] (METRON-1735) Empty print status option causes NPE
[ https://issues.apache.org/jira/browse/METRON-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580320#comment-16580320 ] Michael Miklavcic commented on METRON-1735: --- Is this done? PR is closed and appears to be merged but the Jira is still open. > Empty print status option causes NPE > > > Key: METRON-1735 > URL: https://issues.apache.org/jira/browse/METRON-1735 > Project: Metron > Issue Type: Sub-task >Reporter: Ryan Merriman >Assignee: Ryan Merriman >Priority: Major > > REST does not set a print job status property causing a NPE in PcapJob > because the property is never added to the config. The PcapJob should > default to false. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
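The failure mode in the description (an optional flag that one caller never sets) can be sketched as follows. This is not the PcapJob code: the property key and the class name are hypothetical, and the config is modelled as a plain `Map`. In Java, unboxing a missing value, for example `(boolean) config.get(key)`, throws a `NullPointerException`, so the lookup needs an explicit default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the key name is not taken from Metron.
class PrintStatusDefault {
  static final String PRINT_JOB_STATUS = "print.job.status";

  // Returns the configured flag, or false when the caller never set the property.
  static boolean printJobStatus(Map<String, Object> config) {
    Object value = config.get(PRINT_JOB_STATUS);
    if (value == null) {
      return false; // default when the property is absent
    }
    return Boolean.parseBoolean(value.toString());
  }

  public static void main(String[] args) {
    Map<String, Object> restConfig = new HashMap<>(); // property left unset
    System.out.println(printJobStatus(restConfig));

    Map<String, Object> cliConfig = new HashMap<>();
    cliConfig.put(PRINT_JOB_STATUS, true);
    System.out.println(printJobStatus(cliConfig));
  }
}
```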
[GitHub] metron issue #1157: METRON-1732: Fix job status liveness bug and parallelize...
Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1157 Note - I also added a small blurb about pcap page size to the README.md alongside the notes on setting the finalizer threads. This was missed previously. ---
[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing
[ https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580247#comment-16580247 ] ASF GitHub Bot commented on METRON-1732: Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1157 Note - I also added a small blurb about pcap page size to the README.md alongside the notes on setting the finalizer threads. This was missed previously. > Fix job status liveness bug and parallelize finalizer file writing > -- > > Key: METRON-1732 > URL: https://issues.apache.org/jira/browse/METRON-1732 > Project: Metron > Issue Type: Sub-task >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
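To make "parallelize finalizer file writing" concrete, here is a rough sketch of writing result pages on a fixed-size thread pool. It does not reproduce the change in the PR: `ParallelPageWriter`, the `threads` parameter and the placeholder `write` method are all assumptions, and the actual I/O is replaced by a string so the example runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not the Metron finalizer.
class ParallelPageWriter {
  // Submits one write task per page and returns the results in page order.
  static List<String> writePages(List<byte[]> pages, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<String>> pending = new ArrayList<>();
      for (int i = 0; i < pages.size(); i++) {
        final int page = i;
        final byte[] data = pages.get(i);
        Callable<String> task = () -> write(page, data);
        pending.add(pool.submit(task));
      }
      List<String> written = new ArrayList<>();
      for (Future<String> result : pending) {
        written.add(result.get()); // blocks until that page has been written
      }
      return written;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  // Placeholder for the real I/O; a real finalizer would write the bytes to storage here.
  private static String write(int page, byte[] data) {
    return "page-" + page + " (" + data.length + " bytes)";
  }
}
```

Collecting the futures in submission order keeps the output ordered by page even though the writes themselves may finish in any order.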
[jira] [Commented] (METRON-1714) Create RPM Packaging for the Batch Profiler
[ https://issues.apache.org/jira/browse/METRON-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580245#comment-16580245 ] ASF GitHub Bot commented on METRON-1714: GitHub user nickwallen opened a pull request: https://github.com/apache/metron/pull/1163 METRON-1714 Create RPM Packaging for the Batch Profiler This creates an RPM that makes it simple to install the Batch Profiler. There are now two separate RPM packages created for the Profiler; one for Storm and another for Spark. * metron-profiler-storm-*.rpm * metron-profiler-spark-*.rpm This is a pull request against the `METRON-1699-create-batch-profiler` feature branch. This is dependent on #1161 . By filtering on the last commit, this PR can be reviewed before the others are reviewed and merged. ### Testing 1. Build Metron. ``` mvn clean package -DskipTests -T2C ``` 1. Ensure that the RPMs were built. ``` $ cd metron-deployment/packaging/docker/rpm-docker/target $ find ./ -name "*.rpm" ... .//RPMS/noarch/metron-profiler-storm-0.5.1-201808141752.noarch.rpm ... .//RPMS/noarch/metron-profiler-spark-0.5.1-201808141752.noarch.rpm ... ``` 1. Stand-up a CentOS VM for testing the RPM. ``` vagrant init centos/6 vagrant up ``` 1. Copy both Profiler RPMs to the VM. ``` vagrant scp metron-deployment/packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-spark-*.rpm default:/tmp vagrant scp metron-deployment/packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-storm-*.rpm default:/tmp ``` 1. Install the RPM on the VM. ``` [vagrant@localhost ~]$ sudo su - [root@localhost ~]# rpm -ivh /tmp/metron-profiler-spark-0.5.1-201808141752.noarch.rpm Preparing...### [100%] 1:metron-profiler-spark ### [100%] ``` ``` [root@localhost ~]# rpm -ivh /tmp/metron-profiler-storm-0.5.1-201808141752.noarch.rpm Preparing...### [100%] 1:metron-profiler-storm ### [100%] ``` 1. Ensure that the following files have been installed in `${METRON_HOME}`. 
```
/usr/metron/
└── 0.5.1
    ├── bin
    │   ├── start_batch_profiler.sh
    │   └── start_profiler_topology.sh
    ├── config
    │   ├── batch-profiler.properties
    │   └── profiler.properties
    ├── flux
    │   └── profiler
    │       └── remote.yaml
    └── lib
        ├── metron-profiler-0.5.1-uber.jar
        └── metron-profiler-spark-0.5.1.jar
```

## Pull Request Checklist

- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

You can merge this pull request into a Git repository by running: $ git pull https://github.com/nickwallen/metron METRON-1714 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1163.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1163 commit 6ce28594659928a8c87c57e22e1ab00d798d Author: Nick Allen Date: 2018-07-10T14:08:48Z METRON-1703 Make Core Profiler Components Serializable commit 0051359cbb277881de896526345bb4fce1d5139c Author: Nick Allen Date: 2018-07-10T19:42:19Z METRON-1704 Message Timestamp Logic Should be Shared commit 2413726bdf96221ec775a9c8de524e3ec92148b7 Author: Nick Allen Date: 2018-07-27T17:20:15Z METRON-1706: HbaseClient.mutate should return the number of mutations commit 21980ca764b98ddb96c4c8732e0ef7a6c5ea2c56 Author: Nick Allen Date: 2018-07-24T18:02:36Z METRON-1705 Create ProfilePeriod Using Period ID commit be15126419a2862864a7acd67349281b086f52cf Author: Nick Allen Date:
[GitHub] metron pull request #1163: METRON-1714 Create RPM Packaging for the Batch Pr...
GitHub user nickwallen opened a pull request: https://github.com/apache/metron/pull/1163 METRON-1714 Create RPM Packaging for the Batch Profiler This creates an RPM that makes it simple to install the Batch Profiler. There are now two separate RPM packages created for the Profiler; one for Storm and another for Spark. * metron-profiler-storm-*.rpm * metron-profiler-spark-*.rpm This is a pull request against the `METRON-1699-create-batch-profiler` feature branch. This is dependent on #1161 . By filtering on the last commit, this PR can be reviewed before the others are reviewed and merged. ### Testing 1. Build Metron. ``` mvn clean package -DskipTests -T2C ``` 1. Ensure that the RPMs were built. ``` $ cd metron-deployment/packaging/docker/rpm-docker/target $ find ./ -name "*.rpm" ... .//RPMS/noarch/metron-profiler-storm-0.5.1-201808141752.noarch.rpm ... .//RPMS/noarch/metron-profiler-spark-0.5.1-201808141752.noarch.rpm ... ``` 1. Stand-up a CentOS VM for testing the RPM. ``` vagrant init centos/6 vagrant up ``` 1. Copy both Profiler RPMs to the VM. ``` vagrant scp metron-deployment/packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-spark-*.rpm default:/tmp vagrant scp metron-deployment/packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-storm-*.rpm default:/tmp ``` 1. Install the RPM on the VM. ``` [vagrant@localhost ~]$ sudo su - [root@localhost ~]# rpm -ivh /tmp/metron-profiler-spark-0.5.1-201808141752.noarch.rpm Preparing...### [100%] 1:metron-profiler-spark ### [100%] ``` ``` [root@localhost ~]# rpm -ivh /tmp/metron-profiler-storm-0.5.1-201808141752.noarch.rpm Preparing...### [100%] 1:metron-profiler-storm ### [100%] ``` 1. Ensure that the following files have been installed in `${METRON_HOME}`. 
```
/usr/metron/
└── 0.5.1
    ├── bin
    │   ├── start_batch_profiler.sh
    │   └── start_profiler_topology.sh
    ├── config
    │   ├── batch-profiler.properties
    │   └── profiler.properties
    ├── flux
    │   └── profiler
    │       └── remote.yaml
    └── lib
        ├── metron-profiler-0.5.1-uber.jar
        └── metron-profiler-spark-0.5.1.jar
```

## Pull Request Checklist

- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

You can merge this pull request into a Git repository by running: $ git pull https://github.com/nickwallen/metron METRON-1714 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/metron/pull/1163.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1163 commit 6ce28594659928a8c87c57e22e1ab00d798d Author: Nick Allen Date: 2018-07-10T14:08:48Z METRON-1703 Make Core Profiler Components Serializable commit 0051359cbb277881de896526345bb4fce1d5139c Author: Nick Allen Date: 2018-07-10T19:42:19Z METRON-1704 Message Timestamp Logic Should be Shared commit 2413726bdf96221ec775a9c8de524e3ec92148b7 Author: Nick Allen Date: 2018-07-27T17:20:15Z METRON-1706: HbaseClient.mutate should return the number of mutations commit 21980ca764b98ddb96c4c8732e0ef7a6c5ea2c56 Author: Nick Allen Date: 2018-07-24T18:02:36Z METRON-1705 Create ProfilePeriod Using Period ID commit be15126419a2862864a7acd67349281b086f52cf Author: Nick Allen Date: 2018-07-31T19:26:20Z METRON-1707 Port Profiler to Spark commit c410e412c50f4510f8674cd4fa5d4481f28a4a13 Author: Nick Allen Date:
[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing
[ https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580241#comment-16580241 ] ASF GitHub Bot commented on METRON-1732: Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1157#discussion_r210059160 --- Diff: metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/mr/PcapJob.java --- @@ -307,8 +307,11 @@ public void setCompleteCheckInterval(long interval) { } return this; } -mrJob.submit(); -jobStatus.withState(State.SUBMITTED).withDescription("Job submitted").withJobId(mrJob.getJobID().toString()); +synchronized (this) { --- End diff -- fyi, turns out I was right first time around. Synchronization is necessary for visibility in the timer thread that is started after these modifications. I've updated the comments in code to describe this. > Fix job status liveness bug and parallelize finalizer file writing > -- > > Key: METRON-1732 > URL: https://issues.apache.org/jira/browse/METRON-1732 > Project: Metron > Issue Type: Sub-task >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1157: METRON-1732: Fix job status liveness bug and para...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1157#discussion_r210059160 --- Diff: metron-platform/metron-pcap/src/main/java/org/apache/metron/pcap/mr/PcapJob.java --- @@ -307,8 +307,11 @@ public void setCompleteCheckInterval(long interval) { } return this; } -mrJob.submit(); -jobStatus.withState(State.SUBMITTED).withDescription("Job submitted").withJobId(mrJob.getJobID().toString()); +synchronized (this) { --- End diff -- fyi, turns out I was right first time around. Synchronization is necessary for visibility in the timer thread that is started after these modifications. I've updated the comments in code to describe this. ---
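The visibility point in the comment can be illustrated with a small, self-contained sketch. It is not the PcapJob code: `JobStatusVisibility`, its fields and the job id are hypothetical. The idea shown is that a writer and a timer thread that both synchronize on the same monitor get a happens-before edge, so the timer sees the state and the job id together.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the Metron PcapJob or JobStatus classes.
class JobStatusVisibility {
  private String state = "NOT_RUNNING"; // guarded by "this"
  private String jobId;                 // guarded by "this"

  // Writer: both fields change under the lock, so a reader that takes
  // the same lock sees either both updates or neither.
  void markSubmitted(String id) {
    synchronized (this) {
      state = "SUBMITTED";
      jobId = id;
    }
  }

  // Reader: acquiring the same monitor orders this read after the writer's release.
  synchronized String snapshot() {
    return state + ":" + jobId;
  }

  // Starts a timer thread that reads the status once and returns what it saw,
  // or null if the timer did not run within five seconds.
  static String readFromTimer(JobStatusVisibility status) {
    Timer timer = new Timer(true);
    CountDownLatch done = new CountDownLatch(1);
    String[] seen = new String[1];
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        seen[0] = status.snapshot();
        done.countDown();
      }
    }, 0L);
    try {
      return done.await(5, TimeUnit.SECONDS) ? seen[0] : null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      timer.cancel();
    }
  }

  public static void main(String[] args) {
    JobStatusVisibility status = new JobStatusVisibility();
    status.markSubmitted("job_1");
    System.out.println(readFromTimer(status));
  }
}
```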
[jira] [Commented] (METRON-1732) Fix job status liveness bug and parallelize finalizer file writing
[ https://issues.apache.org/jira/browse/METRON-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580238#comment-16580238 ] ASF GitHub Bot commented on METRON-1732: Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1157

**Testing**

Test plan pulled from here - https://github.com/apache/metron/pull/1081#issuecomment-400556832

Get PCAP data into Metron:

1. Install and set up pycapa (this has been updated in master recently) - https://github.com/apache/metron/blob/master/metron-sensors/pycapa/README.md#centos-6
2. (if using singlenode vagrant) Kill the enrichment, profiler, indexing, and sensor topologies via `for i in bro enrichment random_access_indexing batch_indexing yaf snort;do storm kill $i;done`
3. Start the pcap topology via $METRON_HOME/bin/start_pcap_topology.sh
4. Start the pycapa packet capture producer on eth1 via /usr/bin/pycapa --producer --topic pcap -i eth1 -k node1:6667
5. Watch the topology in the Storm UI and kill the packet capture utility from before, when the number of packets ingested is over 3k.
6. Ensure that at least 3 files exist on HDFS by running hadoop fs -ls /apps/metron/pcap
7. Choose a file (denoted by $FILE) and dump a few of the contents using the pcap_inspector utility via $METRON_HOME/bin/pcap_inspector.sh -i $FILE -n 5
8. Choose one of the lines and note the protocol.
9. Note that when you run the commands below, the resulting file will be placed in the directory from which you kicked off the job.

### Fixed filter

1. Run a fixed filter query by executing the following command with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh fixed -st -df "MMdd" -p -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have a number of records <= 500.

### Query filter

1. Run a Stellar query filter query by executing a command similar to the following, with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh query -st "20160617" -df "MMdd" -query "protocol == '6'" -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have a number of records <= 500.

Also run riffs on the fixed query via the Metron Alerts UI PCAP query panel.

> Fix job status liveness bug and parallelize finalizer file writing > -- > > Key: METRON-1732 > URL: https://issues.apache.org/jira/browse/METRON-1732 > Project: Metron > Issue Type: Sub-task >Reporter: Michael Miklavcic >Assignee: Michael Miklavcic >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1157: METRON-1732: Fix job status liveness bug and parallelize...
Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1157

**Testing**

Test plan pulled from here - https://github.com/apache/metron/pull/1081#issuecomment-400556832

Get PCAP data into Metron:

1. Install and set up pycapa (this has been updated in master recently) - https://github.com/apache/metron/blob/master/metron-sensors/pycapa/README.md#centos-6
2. (if using singlenode vagrant) Kill the enrichment, profiler, indexing, and sensor topologies via `for i in bro enrichment random_access_indexing batch_indexing yaf snort;do storm kill $i;done`
3. Start the pcap topology via $METRON_HOME/bin/start_pcap_topology.sh
4. Start the pycapa packet capture producer on eth1 via /usr/bin/pycapa --producer --topic pcap -i eth1 -k node1:6667
5. Watch the topology in the Storm UI and kill the packet capture utility from before, when the number of packets ingested is over 3k.
6. Ensure that at least 3 files exist on HDFS by running hadoop fs -ls /apps/metron/pcap
7. Choose a file (denoted by $FILE) and dump a few of the contents using the pcap_inspector utility via $METRON_HOME/bin/pcap_inspector.sh -i $FILE -n 5
8. Choose one of the lines and note the protocol.
9. Note that when you run the commands below, the resulting file will be placed in the directory from which you kicked off the job.

### Fixed filter

1. Run a fixed filter query by executing the following command with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh fixed -st -df "MMdd" -p -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have a number of records <= 500.

### Query filter

1. Run a Stellar query filter query by executing a command similar to the following, with the values noted above (match your start_time format to the date format provided - default is to use millis since epoch)
2. `$METRON_HOME/bin/pcap_query.sh query -st "20160617" -df "MMdd" -query "protocol == '6'" -rpf 500`
3. Verify the MR job finishes successfully. Upon completion, you should see multiple files named with relatively current datestamps in your current directory, e.g. pcap-data-20160617160549737+.pcap
4. Copy the files to your local machine and verify you can open them in Wireshark. I chose a middle file and the last file. The middle file should have 500 records (per the records_per_file option), and the last one will likely have a number of records <= 500.

Also run riffs on the fixed query via the Metron Alerts UI PCAP query panel. ---
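The expectation in the test plan (every file but the last holds exactly `-rpf` records, and the last holds at most that many) is plain chunking arithmetic. A small sketch, where the total of 3200 matching records is an invented figure and `RecordsPerFile` is a hypothetical helper rather than anything in Metron:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the records-per-file arithmetic.
class RecordsPerFile {
  // Splits totalRecords into files of at most recordsPerFile each and
  // returns the record count of every file, in order.
  static List<Integer> fileSizes(int totalRecords, int recordsPerFile) {
    if (recordsPerFile <= 0) {
      throw new IllegalArgumentException("recordsPerFile must be positive");
    }
    List<Integer> sizes = new ArrayList<>();
    for (int remaining = totalRecords; remaining > 0; remaining -= recordsPerFile) {
      sizes.add(Math.min(remaining, recordsPerFile));
    }
    return sizes;
  }

  public static void main(String[] args) {
    // prints [500, 500, 500, 500, 500, 500, 200]
    System.out.println(fileSizes(3200, 500));
  }
}
```

With these numbers a middle file has 500 records and the last has 200, which matches the "<= 500" check in step 4.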
[jira] [Assigned] (METRON-1714) Create RPM Packaging for the Batch Profiler
[ https://issues.apache.org/jira/browse/METRON-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Allen reassigned METRON-1714: -- Assignee: Nick Allen > Create RPM Packaging for the Batch Profiler > --- > > Key: METRON-1714 > URL: https://issues.apache.org/jira/browse/METRON-1714 > Project: Metron > Issue Type: Sub-task >Reporter: Nick Allen >Assignee: Nick Allen >Priority: Major > > Create RPM packaging for the Batch Profiler -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1157: METRON-1732: Fix job status liveness bug and parallelize...
Github user mmiklavc commented on the issue: https://github.com/apache/metron/pull/1157 @nickwallen I've addressed your review comments. Let me know what you think. ---
[jira] [Commented] (METRON-1735) Empty print status option causes NPE
[ https://issues.apache.org/jira/browse/METRON-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580179#comment-16580179 ] ASF GitHub Bot commented on METRON-1735: Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1160 > Empty print status option causes NPE > > > Key: METRON-1735 > URL: https://issues.apache.org/jira/browse/METRON-1735 > Project: Metron > Issue Type: Sub-task >Reporter: Ryan Merriman >Assignee: Ryan Merriman >Priority: Major > > REST does not set a print job status property causing a NPE in PcapJob > because the property is never added to the config. The PcapJob should > default to false. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1160: METRON-1735: Empty print status option causes NPE
Github user merrimanr closed the pull request at: https://github.com/apache/metron/pull/1160 ---
[jira] [Commented] (METRON-1696) Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster
[ https://issues.apache.org/jira/browse/METRON-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580168#comment-16580168 ] ASF GitHub Bot commented on METRON-1696: Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1134 The REST component in Ambari is currently responsible for setting up Pcap HDFS directories. It looks like this PR duplicates some of that. I think we will have a dedicated Pcap Ambari component at some point, so I don't think it matters that much where this setup logic lives in the meantime. However, this PR doesn't line up with the feature branch because it only creates one directory where the Pcap query feature requires 3. Also, the property names in Ambari need to match REST. I would recommend reviewing how this is currently implemented in REST. Most of the work was done in https://github.com/apache/metron/pull/1124. Whether this ends up in the REST or parsers component doesn't matter to me as long as it functions. Also, unless we can adjust this PR to only include a couple of minor changes, I would strongly suggest this be tested end to end. I doubt it works in its current state. 
> Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster > - > > Key: METRON-1696 > URL: https://issues.apache.org/jira/browse/METRON-1696 > Project: Metron > Issue Type: Sub-task >Reporter: Mohan >Assignee: Mohan >Priority: Major > > pcap parser fails to write the pcap sequence files to hdfs directory due to > insufficient privileges to hdfs folder for 'metron' user > {code:java} > 2018-07-25 10:15:50.035 o.a.m.s.p.HDFSWriterCallback > Thread-9-kafkaSpout-executor[3 3] [ERROR] Permission denied: user=metron, > access=WRITE, > inode="/apps/metron/pcap/pcap_pcap_1532513746365022000_0_pcap-20-1532414055":hdfs:hdfs:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:325) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:246) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1934) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1917) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2767) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2702) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409) > at > 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1134: METRON-1696: Create the HDFS directory for pcap sequence...
Github user merrimanr commented on the issue: https://github.com/apache/metron/pull/1134 The REST component in Ambari is currently responsible for setting up Pcap HDFS directories. It looks like this PR duplicates some of that. I think we will have a dedicated Pcap Ambari component at some point so I don't think it matters that much where this setup logic lives in the meantime. However, this PR doesn't line up with the feature branch because it only creates one directory where the Pcap query feature requires 3. Also the property names in Ambari need to match REST. I would recommend reviewing how this is currently implemented in REST. Most of the work was done in https://github.com/apache/metron/pull/1124. Whether this ends up in the REST or parsers component doesn't matter to me as long as it functions. Also, unless we can adjust this PR to only include a couple of minor changes I would strongly suggest this be tested end to end. I doubt it works in its current state. ---
[jira] [Commented] (METRON-1707) Port Profiler to Spark
[ https://issues.apache.org/jira/browse/METRON-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580080#comment-16580080 ] ASF GitHub Bot commented on METRON-1707: Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1150#discussion_r210030412 --- Diff: metron-analytics/metron-profiler-spark/src/main/java/org/apache/metron/profiler/spark/function/ProfileBuilderFunction.java --- @@ -0,0 +1,107 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * + */ +package org.apache.metron.profiler.spark.function; + +import org.apache.metron.profiler.DefaultMessageDistributor; +import org.apache.metron.profiler.MessageDistributor; +import org.apache.metron.profiler.MessageRoute; +import org.apache.metron.profiler.ProfileMeasurement; +import org.apache.metron.profiler.spark.ProfileMeasurementAdapter; +import org.apache.metron.stellar.dsl.Context; +import org.apache.spark.api.java.function.MapGroupsFunction; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Properties; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import java.util.stream.StreamSupport; + +import static java.util.Comparator.comparing; +import static org.apache.metron.profiler.spark.BatchProfilerConfig.PERIOD_DURATION; +import static org.apache.metron.profiler.spark.BatchProfilerConfig.PERIOD_DURATION_UNITS; + +/** + * The function responsible for building profiles in Spark. + */ +public class ProfileBuilderFunction implements MapGroupsFunction { + + protected static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private long periodDurationMillis; + private Map globals; + + public ProfileBuilderFunction(Properties properties, Map globals) { +TimeUnit periodDurationUnits = TimeUnit.valueOf(PERIOD_DURATION_UNITS.get(properties, String.class)); +int periodDuration = PERIOD_DURATION.get(properties, Integer.class); +this.periodDurationMillis = periodDurationUnits.toMillis(periodDuration); +this.globals = globals; + } + + /** + * Build a profile from a set of message routes. + * + * This assumes that all of the necessary routes have been provided + * + * @param group The group identifier. + * @param iterator The message routes. 
+ * @return + */ + @Override + public ProfileMeasurementAdapter call(String group, Iterator iterator) throws Exception { +// create the distributor; some settings are unnecessary because it is cleaned-up immediately after processing the batch +int maxRoutes = Integer.MAX_VALUE; +long profileTTLMillis = Long.MAX_VALUE; +MessageDistributor distributor = new DefaultMessageDistributor(periodDurationMillis, profileTTLMillis, maxRoutes); +Context context = TaskUtils.getContext(globals); + +// sort the messages/routes +List routes = toStream(iterator) +.sorted(comparing(rt -> rt.getTimestamp())) +.collect(Collectors.toList()); +LOG.debug("Building a profile for group '{}' from {} message(s)", group, routes.size()); + +// apply each message/route to build the profile +for(MessageRoute route: routes) { + distributor.distribute(route, context); +} --- End diff -- So assuming timestamp ordering doesn't matter, I am also trying to understand whether a groupByKey vs reduceByKey is applicable with the Dataset API. The Catalyst engine is doing some optimization under-the-hood. At the very least, I might add the ability to log an explain plan, but leave the effort to optimize this as a follow-on
[GitHub] metron pull request #1150: METRON-1707 Port Profiler to Spark [Feature Branc...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1150#discussion_r210030412 --- Diff: metron-analytics/metron-profiler-spark/src/main/java/org/apache/metron/profiler/spark/function/ProfileBuilderFunction.java --- @@ -0,0 +1,107 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * + */ +package org.apache.metron.profiler.spark.function; + +import org.apache.metron.profiler.DefaultMessageDistributor; +import org.apache.metron.profiler.MessageDistributor; +import org.apache.metron.profiler.MessageRoute; +import org.apache.metron.profiler.ProfileMeasurement; +import org.apache.metron.profiler.spark.ProfileMeasurementAdapter; +import org.apache.metron.stellar.dsl.Context; +import org.apache.spark.api.java.function.MapGroupsFunction; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Properties; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import java.util.stream.StreamSupport; + +import static java.util.Comparator.comparing; +import static org.apache.metron.profiler.spark.BatchProfilerConfig.PERIOD_DURATION; +import static org.apache.metron.profiler.spark.BatchProfilerConfig.PERIOD_DURATION_UNITS; + +/** + * The function responsible for building profiles in Spark. + */ +public class ProfileBuilderFunction implements MapGroupsFunction { + + protected static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + private long periodDurationMillis; + private Map globals; + + public ProfileBuilderFunction(Properties properties, Map globals) { +TimeUnit periodDurationUnits = TimeUnit.valueOf(PERIOD_DURATION_UNITS.get(properties, String.class)); +int periodDuration = PERIOD_DURATION.get(properties, Integer.class); +this.periodDurationMillis = periodDurationUnits.toMillis(periodDuration); +this.globals = globals; + } + + /** + * Build a profile from a set of message routes. + * + * This assumes that all of the necessary routes have been provided + * + * @param group The group identifier. + * @param iterator The message routes. 
+ * @return + */ + @Override + public ProfileMeasurementAdapter call(String group, Iterator iterator) throws Exception { +// create the distributor; some settings are unnecessary because it is cleaned-up immediately after processing the batch +int maxRoutes = Integer.MAX_VALUE; +long profileTTLMillis = Long.MAX_VALUE; +MessageDistributor distributor = new DefaultMessageDistributor(periodDurationMillis, profileTTLMillis, maxRoutes); +Context context = TaskUtils.getContext(globals); + +// sort the messages/routes +List routes = toStream(iterator) +.sorted(comparing(rt -> rt.getTimestamp())) +.collect(Collectors.toList()); +LOG.debug("Building a profile for group '{}' from {} message(s)", group, routes.size()); + +// apply each message/route to build the profile +for(MessageRoute route: routes) { + distributor.distribute(route, context); +} --- End diff -- So assuming timestamp ordering doesn't matter, I am also trying to understand whether a groupByKey vs reduceByKey is applicable with the Dataset API. The Catalyst engine is doing some optimization under-the-hood. At the very least, I might add the ability to log an explain plan, but leave the effort to optimize this as a follow-on PR. ---
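The diff above calls a `toStream(iterator)` helper whose body is not visible in the quoted portion. The following is a minimal sketch of how such a helper is commonly written with `StreamSupport`, followed by the same sort-by-timestamp step. The `Route` class is a hypothetical stand-in for `MessageRoute`, and the actual helper in the PR may differ.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class SortRoutesSketch {

  /** Hypothetical stand-in for MessageRoute; only the timestamp matters here. */
  static final class Route {
    private final String name;
    private final long timestamp;

    Route(String name, long timestamp) {
      this.name = name;
      this.timestamp = timestamp;
    }

    String getName() { return name; }
    long getTimestamp() { return timestamp; }
  }

  /** Wraps a one-shot Iterator in a sequential, ordered Stream. */
  static <T> Stream<T> toStream(Iterator<T> iterator) {
    Spliterator<T> spliterator =
        Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED);
    return StreamSupport.stream(spliterator, false);
  }

  /** Collects the routes into a list ordered by ascending timestamp. */
  static List<Route> sortByTimestamp(Iterator<Route> iterator) {
    return toStream(iterator)
        .sorted(Comparator.comparingLong(Route::getTimestamp))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Route> routes = Arrays.asList(
        new Route("late", 30L), new Route("early", 10L), new Route("middle", 20L));
    for (Route route : sortByTimestamp(routes.iterator())) {
      System.out.println(route.getTimestamp() + " " + route.getName());
    }
  }
}
```

Note that the sort materializes the whole group in memory, which is part of why the `groupByKey` versus `reduceByKey` question in the comment matters for large groups.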
[jira] [Commented] (METRON-1735) Empty print status option causes NPE
[ https://issues.apache.org/jira/browse/METRON-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579894#comment-16579894 ] ASF GitHub Bot commented on METRON-1735: Github user justinleet commented on the issue: https://github.com/apache/metron/pull/1160 +1 by inspection. Thanks for catching this, it's a good fix. > Empty print status option causes NPE > > > Key: METRON-1735 > URL: https://issues.apache.org/jira/browse/METRON-1735 > Project: Metron > Issue Type: Sub-task >Reporter: Ryan Merriman >Assignee: Ryan Merriman >Priority: Major > > REST does not set a print job status property, causing an NPE in PcapJob > because the property is never added to the config. The PcapJob should > default to false. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
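The failure mode described in METRON-1735 can be illustrated with a small sketch: unboxing a missing `Boolean` from a config map throws a `NullPointerException`, while a null-safe read falls back to false. The key name `print.job.status` and both method names are hypothetical; the actual property and the fix in PcapJob are not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

final class PrintStatusDefaultSketch {

  // Hypothetical key name, used only for illustration.
  static final String PRINT_JOB_STATUS = "print.job.status";

  /** Auto-unboxes the stored value; throws NullPointerException if the key is absent. */
  static boolean readUnsafe(Map<String, Object> config) {
    Boolean value = (Boolean) config.get(PRINT_JOB_STATUS);
    return value;
  }

  /** Treats a missing or null value as false instead of failing. */
  static boolean readWithDefault(Map<String, Object> config) {
    return Boolean.TRUE.equals(config.get(PRINT_JOB_STATUS));
  }

  public static void main(String[] args) {
    Map<String, Object> config = new HashMap<>();
    System.out.println("missing key -> " + readWithDefault(config));
    config.put(PRINT_JOB_STATUS, Boolean.TRUE);
    System.out.println("explicit true -> " + readWithDefault(config));
  }
}
```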
[GitHub] metron issue #1160: METRON-1735: Empty print status option causes NPE
Github user justinleet commented on the issue: https://github.com/apache/metron/pull/1160 +1 by inspection. Thanks for catching this, it's a good fix. ---
[GitHub] metron issue #1150: METRON-1707 Port Profiler to Spark [Feature Branch]
Github user nickwallen commented on the issue: https://github.com/apache/metron/pull/1150 I just refactored `BatchProfiler` hopefully to make it simpler to grok. The functionality was not changed. I wanted to refactor prior to additional work on @simonellistonball 's `reduceByKey` suggestion. ---
[jira] [Commented] (METRON-1707) Port Profiler to Spark
[ https://issues.apache.org/jira/browse/METRON-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579886#comment-16579886 ] ASF GitHub Bot commented on METRON-1707: Github user nickwallen commented on the issue: https://github.com/apache/metron/pull/1150 I just refactored `BatchProfiler` hopefully to make it simpler to grok. The functionality was not changed. I wanted to refactor prior to additional work on @simonellistonball 's `reduceByKey` suggestion. > Port Profiler to Spark > -- > > Key: METRON-1707 > URL: https://issues.apache.org/jira/browse/METRON-1707 > Project: Metron > Issue Type: Sub-task >Reporter: Nick Allen >Assignee: Nick Allen >Priority: Major > > Create a port of the Profiler that runs in Spark. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (METRON-1696) Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster
[ https://issues.apache.org/jira/browse/METRON-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohan updated METRON-1696: -- Issue Type: Sub-task (was: Bug) Parent: METRON-1554 > Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster > - > > Key: METRON-1696 > URL: https://issues.apache.org/jira/browse/METRON-1696 > Project: Metron > Issue Type: Sub-task >Reporter: Mohan >Assignee: Mohan >Priority: Major > > pcap parser fails to write the pcap sequence files to hdfs directory due to > insufficient privileges to hdfs folder for 'metron' user > {code:java} > 2018-07-25 10:15:50.035 o.a.m.s.p.HDFSWriterCallback > Thread-9-kafkaSpout-executor[3 3] [ERROR] Permission denied: user=metron, > access=WRITE, > inode="/apps/metron/pcap/pcap_pcap_1532513746365022000_0_pcap-20-1532414055":hdfs:hdfs:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:325) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:246) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1934) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1917) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2767) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2702) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (METRON-1733) PCAP UI - PCAP queries don't work on Safari
[ https://issues.apache.org/jira/browse/METRON-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579625#comment-16579625 ] ASF GitHub Bot commented on METRON-1733: Github user sardell closed the pull request at: https://github.com/apache/metron/pull/1158 > PCAP UI - PCAP queries don't work on Safari > --- > > Key: METRON-1733 > URL: https://issues.apache.org/jira/browse/METRON-1733 > Project: Metron > Issue Type: Sub-task >Reporter: Shane Ardell >Assignee: Shane Ardell >Priority: Major > > On Safari, PCAP queries fail with a 500 internal server error. No issues seen > with Chrome or Firefox. After digging into the search request, it looks like > the values for the startTime and endTime are 'NaN'. It looks like Safari > cannot parse the format of the time we are passing to the getDate() function. > For more on this issue: > https://stackoverflow.com/questions/21883699/safari-javascript-date-nan-issue--mm-dd-hhmmss -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron pull request #1158: METRON-1733: PCAP UI - PCAP queries don't work on...
Github user sardell closed the pull request at: https://github.com/apache/metron/pull/1158 ---
[jira] [Commented] (METRON-1696) Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster
[ https://issues.apache.org/jira/browse/METRON-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579567#comment-16579567 ] ASF GitHub Bot commented on METRON-1696: Github user MohanDV commented on the issue: https://github.com/apache/metron/pull/1134 The non-Kerberized pcap topology runs as the 'storm' user, which requires write access through the 'hadoop' group, so we should set mode 0775 when the cluster is unsecured and mode 0755 when the cluster is secured, since the topology then runs as the 'metron' user. > Pcap parser fails to write pacap sequence file to hdfs on kerberized cluster > - > > Key: METRON-1696 > URL: https://issues.apache.org/jira/browse/METRON-1696 > Project: Metron > Issue Type: Bug >Reporter: Mohan >Assignee: Mohan >Priority: Major > > pcap parser fails to write the pcap sequence files to hdfs directory due to > insufficient privileges to hdfs folder for 'metron' user > {code:java} > 2018-07-25 10:15:50.035 o.a.m.s.p.HDFSWriterCallback > Thread-9-kafkaSpout-executor[3 3] [ERROR] Permission denied: user=metron, > access=WRITE, > inode="/apps/metron/pcap/pcap_pcap_1532513746365022000_0_pcap-20-1532414055":hdfs:hdfs:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:325) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:246) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1934) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1917) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2767) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2702) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] metron issue #1134: METRON-1696: Create the HDFS directory for pcap sequence...
Github user MohanDV commented on the issue: https://github.com/apache/metron/pull/1134 The non-Kerberized pcap topology runs as the 'storm' user, which requires write access through the 'hadoop' group, so we should set mode 0775 when the cluster is unsecured and mode 0755 when the cluster is secured, since the topology then runs as the 'metron' user. ---
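As a rough illustration of the rule proposed in this comment, the sketch below picks the directory mode from the cluster's security state and checks whether the group-write bit is set. The class and method names are hypothetical, and the real setup logic lives in the Ambari service scripts rather than in Java; this only shows the decision itself.

```java
final class PcapDirModeSketch {

  /** Returns the octal mode proposed in the comment: 0755 when Kerberized, 0775 otherwise. */
  static String modeFor(boolean kerberized) {
    return kerberized ? "0755" : "0775";
  }

  /** True when the group-write bit (octal 020) is set in the given octal mode string. */
  static boolean groupCanWrite(String octalMode) {
    int bits = Integer.parseInt(octalMode, 8);
    return (bits & 0020) != 0;
  }

  public static void main(String[] args) {
    for (boolean kerberized : new boolean[] {false, true}) {
      String mode = modeFor(kerberized);
      System.out.println("kerberized=" + kerberized
          + " mode=" + mode
          + " groupWrite=" + groupCanWrite(mode));
    }
  }
}
```

With 0775 the 'storm' user can write through its group membership; with 0755 only the directory owner can write, so that mode only works if the directory is owned by the user running the topology.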