[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-30 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919742#comment-16919742 ] Wes McKinney commented on ARROW-5995: Can you invoke {{hdfs dfs -checksum}} using a system call to

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-30 Thread Ruslan Kuprieiev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919311#comment-16919311 ] Ruslan Kuprieiev commented on ARROW-5995: Btw, [~wesmckinn] [~npr], what are your thoughts on

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-30 Thread Ruslan Kuprieiev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919309#comment-16919309 ] Ruslan Kuprieiev commented on ARROW-5995: You are right, such a hackish approach would probably

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Max Risuhin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918903#comment-16918903 ] Max Risuhin commented on ARROW-5995: I think that relying on internal and undocumented (so far I

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Ruslan Kuprieiev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918825#comment-16918825 ] Ruslan Kuprieiev commented on ARROW-5995: [~Max Risuhin] Nice, so metafiles are indeed out there

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Max Risuhin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918810#comment-16918810 ] Max Risuhin commented on ARROW-5995: Well, storing a small file under the /test/test.txt Hadoop path

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Max Risuhin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918799#comment-16918799 ] Max Risuhin commented on ARROW-5995: I understand "hidden" as internal implementation details. Will

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Ruslan Kuprieiev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918730#comment-16918730 ] Ruslan Kuprieiev commented on ARROW-5995: > stores these checksums in a separate hidden file in

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Max Risuhin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918694#comment-16918694 ] Max Risuhin commented on ARROW-5995: [~efiop] > Checksum (MD5 of block CRCs) is always computed on

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-29 Thread Ruslan Kuprieiev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918345#comment-16918345 ] Ruslan Kuprieiev commented on ARROW-5995: [~Max Risuhin] Thanks for the research! Just a few

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-28 Thread Max Risuhin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918152#comment-16918152 ] Max Risuhin commented on ARROW-5995: The Arrow codebase seems to support HDFS access by utilizing 2

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-08-19 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910802#comment-16910802 ] Wes McKinney commented on ARROW-5995: My comment about libhdfs3 was an aside. The checksum feature

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-07-23 Thread Ruslan Kuprieiev (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891017#comment-16891017 ] Ruslan Kuprieiev commented on ARROW-5995: [~wesmckinn] I am aware that libhdfs3 is not

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-07-23 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891002#comment-16891002 ] Wes McKinney commented on ARROW-5995: libhdfs3 is unmaintained software, so I am likely to -1 any

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-07-22 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890500#comment-16890500 ] Neal Richardson commented on ARROW-5995: It sounds like you already know more about HDFS

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-07-22 Thread Ruslan Kuprieiev (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890490#comment-16890490 ] Ruslan Kuprieiev commented on ARROW-5995: Got it :) Sure, I would love to contribute a patch, but

[jira] [Commented] (ARROW-5995) [Python] pyarrow: hdfs: support file checksum

2019-07-22 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890469#comment-16890469 ] Neal Richardson commented on ARROW-5995: Most likely there's no intentional reason for it not