[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807290#comment-16807290 ]

Kunal Khatua commented on DRILL-7032:
-------------------------------------

[~cgivre] does this require any additional documentation beyond a mention in the release notes? (cc: [~bbevens])

> Ignore corrupt rows in a PCAP file
> ----------------------------------
>
>                 Key: DRILL-7032
>                 URL: https://issues.apache.org/jira/browse/DRILL-7032
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Functions - Drill
>    Affects Versions: 1.15.0
>         Environment: OS: Ubuntu 18.4
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>            Reporter: Giovanni Conte
>            Assignee: Charles Givre
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>
> It would be useful for Drill to be able to ignore corrupt rows in a
> PCAP file instead of throwing a Java exception.
> Many PCAP files contain corrupted lines, and this functionality would
> avoid having to pre-fix the packet captures (example in the attached
> file).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806444#comment-16806444 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

asfgit commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800813#comment-16800813 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

cgivre commented on issue #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#issuecomment-476254011

Commits squashed. Thanks!
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800807#comment-16800807 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

arina-ielchiieva commented on issue #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#issuecomment-476251224

+1, LGTM. @cgivre please squash the commits.
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800779#comment-16800779 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

cgivre commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r268688621

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java ##
@@ -324,7 +333,12 @@ public int getDst_port() {
     byte[] data = null;
     if (packetLength >= payloadDataStart) {
       data = new byte[packetLength - payloadDataStart];
-      System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+      try {
+        System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+      } catch (Exception e) {
+        isCorrupt = true;
+        logger.info("Error while parsing PCAP data: ", e.getMessage());

Review comment:
Thanks @arina-ielchiieva. Fixed.
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800459#comment-16800459 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

arina-ielchiieva commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r268515281

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java ##
@@ -324,7 +333,12 @@ public int getDst_port() {
     byte[] data = null;
     if (packetLength >= payloadDataStart) {
       data = new byte[packetLength - payloadDataStart];
-      System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+      try {
+        System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+      } catch (Exception e) {
+        isCorrupt = true;
+        logger.info("Error while parsing PCAP data: ", e.getMessage());

Review comment:
I think logging at info level will produce an entry for each corrupt row, and the log file can grow enormously. I guess this should be debug; you can also log the full exception at trace level:
```
String message = "Error while parsing PCAP data: {}";
logger.debug(message, e.getMessage());
logger.trace(message, e);
```
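
The review suggestion above can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the code merged in the pull request: the class `PayloadCopySketch` and the method `copyPayload` are hypothetical names, `java.util.logging` stands in for Drill's SLF4J logger so that the example needs no dependencies (`Level.FINE` and `Level.FINER` play the roles of debug and trace, and `{0}` replaces SLF4J's `{}` placeholder), and the catch is narrowed to `IndexOutOfBoundsException` on the assumption that a truncated record is the failure being guarded against.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the payload-copy step shown in the diff above. */
final class PayloadCopySketch {
  private static final Logger logger =
      Logger.getLogger(PayloadCopySketch.class.getName());

  // Mirrors the isCorrupt flag added to Packet.java in the pull request.
  private boolean isCorrupt = false;

  boolean isCorrupt() {
    return isCorrupt;
  }

  /**
   * Copies the payload bytes out of the raw packet buffer. If the packet
   * claims more bytes than the buffer holds, the packet is flagged as corrupt
   * instead of letting the exception propagate.
   */
  byte[] copyPayload(byte[] raw, int ipOffset, int payloadDataStart, int packetLength) {
    byte[] data = null;
    if (packetLength >= payloadDataStart) {
      data = new byte[packetLength - payloadDataStart];
      try {
        System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
      } catch (IndexOutOfBoundsException e) {
        // arraycopy leaves the destination untouched when it throws,
        // so data stays zero-filled here.
        isCorrupt = true;
        // Low-volume level: one short line per corrupt packet, off by default.
        logger.log(Level.FINE, "Error while parsing PCAP data: {0}", e.getMessage());
        // Even finer level: full stack trace, only when explicitly enabled.
        logger.log(Level.FINER, "Error while parsing PCAP data", e);
      }
    }
    return data;
  }
}
```

With the default logging configuration neither message is written, which is the point of the review comment: a capture with many corrupt rows does not flood the log, while the detail remains available when the finer levels are switched on.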
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800398#comment-16800398 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

cgivre commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r268485150

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java ##
@@ -53,6 +53,7 @@
   private int packetLength;
   protected int etherProtocol;
   protected int protocol;
+  protected boolean isCorrupt = false;

Review comment:
Fixed
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799898#comment-16799898 ]

ASF GitHub Bot commented on DRILL-7032:
---------------------------------------

cgivre commented on issue #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#issuecomment-475925369

I created a new PR to duplicate this functionality in the PCAP-NG plugin.
https://issues.apache.org/jira/browse/DRILL-7032
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796473#comment-16796473 ]

Arina Ielchiieva commented on DRILL-7032:
-----------------------------------------

[~cgivre] did you open a Jira for the PCAP-NG parser? Can you please link it?
[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file
[ https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777569#comment-16777569 ]

Charles Givre commented on DRILL-7032:
--------------------------------------

[~priteshm] While I don't think this code breaks anything, I don't think it's entirely "done" either. Someone reported that it does not work properly with a PCAP whose first line is corrupt. I don't have an example of that to test, but I think this should be considered a work in progress, and we should make the parser more robust as we get more examples of corrupted PCAPs. We'll also need to add this functionality to the PCAP-NG parser. I'll open a separate JIRA for that.