[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-03-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

This is fixed in HADOOP-15209

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Patch Available  (was: Open)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Open  (was: Patch Available)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: HADOOP-15208-003.patch

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Open  (was: Patch Available)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: HADOOP-15208-002.patch

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Patch Available  (was: Open)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-14 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Patch Available  (was: Open)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-07 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Open  (was: Patch Available)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-07 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: HADOOP-15208-002.patch

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-06 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Status: Patch Available  (was: Open)

Tested: S3A US west
Not tested: Azure, Allyun
No ADL test case, interestingly. One should go in, but separately from this

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-06 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: (was: HADOOP-15208-001.patch)

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-06 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: HADOOP-15208-001.patch

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-02-06 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Attachment: HADOOP-15208-001.patch

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Attachments: HADOOP-15208-001.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of source, dest listings, 
> people (myself included) can experiment with different strategies before 
> trying to commit one which doesn't scale



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org