[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2018-06-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15121:
---
Fix Version/s: 3.2.0  (was: 3.1.0)

Deferring this to 3.2.0 since the branch for 3.1.0 has been cut off.

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.
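
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the function names, the scheme set, the `job-N` layout and the `.hive-staging` suffix are all assumptions made for the example. Intermediate jobs get an HDFS scratch directory; only the final job is pointed at the blobstore, and only when the target table lives there.

```python
from urllib.parse import urlparse

# Hypothetical set of URI schemes treated as blobstores (an assumption for this sketch).
BLOBSTORE_SCHEMES = {"s3", "s3a", "s3n"}


def is_blobstore(path: str) -> bool:
    """Return True if the path's URI scheme is in BLOBSTORE_SCHEMES."""
    return urlparse(path).scheme.lower() in BLOBSTORE_SCHEMES


def scratch_dir_for_job(job_index: int, num_jobs: int,
                        table_location: str, hdfs_scratch: str) -> str:
    """Pick the output directory for one job of a multi-job query."""
    is_final_job = job_index == num_jobs - 1
    if is_final_job and is_blobstore(table_location):
        # Final job: stage next to the table so the later move stays
        # inside the blobstore. The directory name is illustrative only.
        return table_location.rstrip("/") + "/.hive-staging"
    # Intermediate jobs (and final jobs for HDFS tables) stay on HDFS.
    return f"{hdfs_scratch.rstrip('/')}/job-{job_index}"


def plan_scratch_dirs(num_jobs: int, table_location: str, hdfs_scratch: str) -> list:
    """Return the scratch directory chosen for each job, in execution order."""
    return [scratch_dir_for_job(i, num_jobs, table_location, hdfs_scratch)
            for i in range(num_jobs)]
```

For a three-job query writing to `s3a://bucket/warehouse/t`, `plan_scratch_dirs` returns two HDFS directories followed by one directory under the table location.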



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2018-04-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15121:
---
Fix Version/s: 3.1.0  (was: 3.0.0)

Deferring this to 3.1.0 since the branch for 3.0.0 has been cut off. Please 
update the JIRA if you would like to get your patch in 3.0.0.

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2017-10-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15121:
---
Fix Version/s: 3.0.0  (was: 2.3.0)

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.patch, HIVE-15121.WIP.1.patch, 
> HIVE-15121.WIP.2.patch, HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2017-03-07 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15121:
--
Labels:   (was: TODOC2.2)

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.patch, HIVE-15121.WIP.1.patch, 
> HIVE-15121.WIP.2.patch, HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-22 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15121:
--
Labels: TODOC2.2  (was: )

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-22 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15121:
---
Resolution: Fixed
Fix Version/s: 2.2.0
Status: Resolved  (was: Patch Available)

Thanks [~stakiar] for your contribution. I committed this to master.

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-22 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Description: 
Hive should be able to configure all intermediate MR jobs to write to HDFS, but 
the final MR job to write to S3.

This will be useful for implementing parallel renames on S3. The idea is that 
for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
final job writes to S3. Writing to HDFS should be faster than writing to S3, so 
it makes more sense to write intermediate data to HDFS.

The advantage is that any copying of data that needs to be done from the 
scratch directory to the final table directory can be done server-side, within 
the blobstore. The MoveTask simply renames data from the scratch directory to 
the final table location, which should translate to a server-side COPY request. 
This way HiveServer2 doesn't have to actually copy any data, it just tells the 
blobstore to do all the work.

  was:
Hive should be able to configure all intermediate MR jobs to write to HDFS, but 
the final MR job to write to S3.

This will be useful for implementing parallel renames on S3. The idea is that 
for a mutli-job query, all intermediate MR jobs write to HDFS, and then the 
final job writes to S3. Writing to HDFS should be faster than writing to S3, so 
it makes more sense to write intermediate data to HDFS.


> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.
> The advantage is that any copying of data that needs to be done from the 
> scratch directory to the final table directory can be done server-side, 
> within the blobstore. The MoveTask simply renames data from the scratch 
> directory to the final table location, which should translate to a 
> server-side COPY request. This way HiveServer2 doesn't have to actually copy 
> any data, it just tells the blobstore to do all the work.
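
A toy model of the server-side move described above, assuming an object store where a rename is a per-object copy followed by a delete. `FakeBlobstore` and its methods are hypothetical stand-ins written for this sketch, not a real S3 client or Hive's MoveTask. The point it illustrates is that the client uploads bytes only once, into the scratch prefix; the move to the table location transfers nothing through the client.

```python
class FakeBlobstore:
    """In-memory stand-in for an object store (illustration only)."""

    def __init__(self):
        self.objects = {}               # key -> bytes
        self.client_bytes_uploaded = 0  # bytes sent by the client

    def put(self, key: str, data: bytes) -> None:
        """Client upload: the data travels from the client to the store."""
        self.objects[key] = data
        self.client_bytes_uploaded += len(data)

    def copy(self, src: str, dst: str) -> None:
        """Server-side copy: the store duplicates the object itself."""
        self.objects[dst] = self.objects[src]

    def delete(self, key: str) -> None:
        del self.objects[key]

    def rename_prefix(self, src_prefix: str, dst_prefix: str) -> None:
        """Emulate a directory rename as copy + delete for every object."""
        # Snapshot the keys first because the dict is mutated in the loop.
        for key in [k for k in self.objects if k.startswith(src_prefix)]:
            self.copy(key, dst_prefix + key[len(src_prefix):])
            self.delete(key)


store = FakeBlobstore()
# The final job writes its output into a scratch prefix on the blobstore.
store.put("warehouse/t/.staging/part-0", b"abc")
store.put("warehouse/t/.staging/part-1", b"de")
uploaded_before_move = store.client_bytes_uploaded

# The move to the table location happens entirely inside the store.
store.rename_prefix("warehouse/t/.staging/", "warehouse/t/")
```

After the rename, `store.client_bytes_uploaded` is unchanged from `uploaded_before_move`, while the objects now sit directly under `warehouse/t/`.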





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-22 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.3.patch

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.3.patch, HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-21 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.2.patch

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
>  Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-15121.1.patch, HIVE-15121.2.patch, 
> HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, HIVE-15121.WIP.patch, 
> HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-07 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.1.patch

Adding qtest.

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-15121.1.patch, HIVE-15121.WIP.1.patch, 
> HIVE-15121.WIP.2.patch, HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.patch

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Attachments: HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch, HIVE-15121.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-04 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.WIP.2.patch

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Attachments: HIVE-15121.WIP.1.patch, HIVE-15121.WIP.2.patch, 
> HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-03 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.WIP.1.patch

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Attachments: HIVE-15121.WIP.1.patch, HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-03 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Status: Patch Available  (was: Open)

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Attachments: HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.





[jira] [Updated] (HIVE-15121) Last MR job in Hive should be able to write to a different scratch directory

2016-11-03 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15121:

Attachment: HIVE-15121.WIP.patch

Did some initial work on this. The patch works for CTAS and INSERT INTO queries 
if hive.merge.mapfiles = false. If it's true, then some weird stuff happens: 
data gets written to S3, moved to the local fs, and then moved back to S3. 
Still figuring out how to fix that.

Patch is a WIP.

> Last MR job in Hive should be able to write to a different scratch directory
> 
>
> Key: HIVE-15121
> URL: https://issues.apache.org/jira/browse/HIVE-15121
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Sahil Takiar
> Attachments: HIVE-15121.WIP.patch
>
>
> Hive should be able to configure all intermediate MR jobs to write to HDFS, 
> but the final MR job to write to S3.
> This will be useful for implementing parallel renames on S3. The idea is that 
> for a multi-job query, all intermediate MR jobs write to HDFS, and then the 
> final job writes to S3. Writing to HDFS should be faster than writing to S3, 
> so it makes more sense to write intermediate data to HDFS.


