[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631691#comment-16631691 ]

t oo commented on HIVE-14271:
-----------------------------

Is this still relevant?

> FileSinkOperator should not rename files to final paths when S3 is the default destination
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14271
>                 URL: https://issues.apache.org/jira/browse/HIVE-14271
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>            Priority: Major
>
> FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it has finished
> writing all rows to a temporary path. The problem is that S3 does not support
> renaming.
> Two options can be considered:
> a. Use a copy operation instead. After FileSinkOperator writes all rows to
> outPaths, the commit method will do a copy() call instead of move().
> b. Write row by row directly to the S3 path (see HIVE-1620). This may give
> better performance, but we should take care of the cleanup part in case
> of writing errors.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665533#comment-15665533 ]

Sahil Takiar commented on HIVE-14271:
-------------------------------------

[~spena] I looked further into what we discussed this morning. You are correct: there are two places where the {{FileSinkOperator}} renames files.

The first is in the {{commit(FileSystem)}} method, which is invoked inside each map task. The second is in the {{jobCloseOp(boolean)}} method, which is invoked inside HiveServer2.

I think we can break this work down into two JIRAs:

1: Eliminate the rename that occurs in HiveServer2
2: Eliminate the rename that occurs inside each map task

When running on S3, I can't think of a reason why either would be necessary. I think the first priority will be to eliminate the rename that occurs in HiveServer2 (as you said this morning).
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651862#comment-15651862 ]

Sahil Takiar commented on HIVE-14271:
-------------------------------------

Yes, I agree with Steve. Sergio summarized it well. This sounds like a reasonable change. [~spena], can you re-open this JIRA?
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651362#comment-15651362 ]

Sergio Peña commented on HIVE-14271:
------------------------------------

Agree with approach #2. If outPath and finalPath are scratch directories, then we can just write directly to finalPath and avoid the rename.

[~ste...@apache.org] There is another patch to do S3-to-S3 renames in parallel to speed up the COPY operations (see HIVE-15093).
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651284#comment-15651284 ]

Steve Loughran commented on HIVE-14271:
---------------------------------------

Strategy 2 will eliminate one rename, which, with rename costs being O(data), is good.

However, there's still one rename to go: the overhead of copying the data from scratch to final remains. This shouldn't be done in the client-side code, as object store COPY operations happen server side; they're what rename() uses. If renames of files in a directory are issued in parallel, then the rename can be sped up significantly. This works precisely because you can hold open the HTTP connections for the copy calls without much cost in network traffic.
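The parallel rename described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the HIVE-15093 patch: `InMemoryObjectStore`, `parallel_rename`, and the key names are hypothetical stand-ins, and a real object store would perform each copy server side over HTTP rather than in a local dictionary.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store; not a real S3 client."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def copy(self, src_key, dst_key):
        # In a real store this copy would run server side.
        self.objects[dst_key] = self.objects[src_key]

    def delete(self, key):
        del self.objects[key]

    def list_keys(self, prefix):
        return sorted(k for k in self.objects if k.startswith(prefix))


def parallel_rename(store, src_prefix, dst_prefix, max_workers=8):
    """Rename every object under src_prefix, issuing the copies in parallel."""
    keys = store.list_keys(src_prefix)

    def move_one(src_key):
        dst_key = dst_prefix + src_key[len(src_prefix):]
        store.copy(src_key, dst_key)
        store.delete(src_key)
        return dst_key

    # Each worker handles one copy+delete pair; the calls overlap in time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sorted(pool.map(move_one, keys))


store = InMemoryObjectStore()
for i in range(4):
    store.put(f"scratch/part-{i:03d}", f"rows-{i}")

moved = parallel_rename(store, "scratch/", "warehouse/t/")
```

The total amount of data copied is unchanged; only the wall-clock time shrinks, because the per-file copy requests overlap instead of running one after another.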
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649356#comment-15649356 ]

Sahil Takiar commented on HIVE-14271:
-------------------------------------

We might want to consider re-opening this ticket, but changing the original approach.

To clarify: right now the FileSinkOperator (FSOP) always writes all its data to a scratch directory. The FSOP first writes to {{outPaths}} and then renames the data to {{finalPaths}}, but all the data is still under the scratch directory. No data is exposed to users or future ETL jobs yet.

There are two different ways to modify this to improve performance on S3:

1: FSOP implements the "direct output committer" strategy (similar to HIVE-1620): all data is written directly to the final table location, and no data is written to a staging file or to the scratch directory. Hive's MoveTask (which runs in HiveServer2) does nothing.

2: FSOP writes data to a scratch directory, but it writes to {{finalPaths}} instead of {{outPaths}} (remember that both of these directories are still under the scratch directory). Hive's MoveTask (which runs inside HiveServer2) copies the data from the scratch directory to the final table location. The FSOP writes directly to the final location in the scratch directory; no writing to a temp file is done. This improves performance since it avoids copying data from {{outPaths}} to {{finalPaths}}.

For reasons stated in earlier comments, there are a number of issues with approach 1. Implementing approach 2 should be better, and should improve performance significantly.
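To make the difference between the flows in the comment above concrete, here is a small sketch that lists where each strategy writes and which renames follow. The function `plan_write`, the `_tmp.` file prefix, and the `s3a://bucket/...` paths are illustrative assumptions, not Hive's actual path layout or API.

```python
def plan_write(strategy, task_file, scratch_dir, table_dir):
    """Return (path the task writes to, list of (src, dst) renames that follow)."""
    out_path = f"{scratch_dir}/_tmp.{task_file}"  # temp file (illustrative naming)
    final_path = f"{scratch_dir}/{task_file}"     # still under the scratch directory
    table_path = f"{table_dir}/{task_file}"       # final table location

    if strategy == "current":
        # Task renames outPath -> finalPath, then MoveTask moves it to the table.
        return out_path, [(out_path, final_path), (final_path, table_path)]
    if strategy == "approach_2":
        # Task writes straight to finalPath; only the MoveTask copy remains.
        return final_path, [(final_path, table_path)]
    if strategy == "approach_1":
        # Direct output committer: write to the table location, no renames.
        return table_path, []
    raise ValueError(f"unknown strategy: {strategy}")


plans = {
    name: plan_write(name, "000000_0", "s3a://bucket/scratch", "s3a://bucket/warehouse/t")
    for name in ("current", "approach_1", "approach_2")
}
rename_counts = {name: len(renames) for name, (_, renames) in plans.items()}
```

Because each rename on S3 costs O(data), the rename count is a rough proxy for how many times the output is copied: two today, one under approach 2, and none under approach 1 (at the price of the safety issues raised in the earlier comments).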
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15600134#comment-15600134 ]

Sahil Takiar commented on HIVE-14271:
-------------------------------------

Thanks [~ste...@apache.org], that does make sense.
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593138#comment-15593138 ]

Steve Loughran commented on HIVE-14271:
---------------------------------------

One funny thing about last-writer-wins is this scenario:

# executor 1 starts working on part-001
# executor 2 also starts working on it and opens a stream to part-001
# executor 2 finishes; its work becomes visible
# whatever was waiting for part-001 to be ready sets off
# executor 1 finishes and overwrites the existing part-001

That needs to be avoided.
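The five steps above can be replayed deterministically against a toy store. This is only a sketch: `LastWriterWinsStore` is a hypothetical in-memory model in which an object becomes visible when its writer finishes and any later `put` silently replaces it.

```python
class LastWriterWinsStore:
    """Hypothetical model: an object appears on put(); later puts overwrite it."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data  # no conflict detection, last writer wins

    def get(self, key):
        return self.objects.get(key)


store = LastWriterWinsStore()

# Steps 1-2: both attempts are working on part-001; nothing is visible yet.
visible_before_commit = store.get("part-001")

# Step 3: attempt 2 finishes first and its output becomes visible.
store.put("part-001", "output of attempt 2")

# Step 4: a downstream consumer sees part-001 and starts reading it.
seen_by_reader = store.get("part-001")

# Step 5: attempt 1 finishes late and replaces the object under the reader.
store.put("part-001", "output of attempt 1")

reader_is_stale = seen_by_reader != store.get("part-001")
```

Even if both attempts produce equivalent rows, the consumer started from an object that no longer exists in that form, which is the situation the comment says must be avoided.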
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593020#comment-15593020 ]

Sahil Takiar commented on HIVE-14271:
-------------------------------------

[~cnauroth], we were actually thinking of implementing a "direct output committer" strategy for Hive (it would be optional, of course). Could you expand on what the drawbacks of this approach would be?

For the issue reported in SPARK-10063, I think you should be able to add a config option that says the file is only closed if the task was successful. I know there are other concerns with things like speculative execution and task retries, but Hive may be able to overcome those by making sure each task attempt writes to the same file on S3. Since S3 follows a last-writer-wins approach, and each task attempt is idempotent, there should be no data issues (a similar approach was taken in HIVE-1620).

Thoughts?
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389878#comment-15389878 ]

Thomas Poepping commented on HIVE-14271:
----------------------------------------

If we have [HIVE-14270|https://issues.apache.org/jira/browse/HIVE-14270], then it seems like only the first option will be necessary, as all temporary paths will be on HDFS. The "rename" can be changed to a move or copy, giving us only one operation to S3, rather than many.

This also avoids the potential downsides Chris describes with direct output committing.

>                 Key: HIVE-14271
>                 URL: https://issues.apache.org/jira/browse/HIVE-14271
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Abdullah Yousufi
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389835#comment-15389835 ]

Chris Nauroth commented on HIVE-14271:
--------------------------------------

If I understand correctly, then approach b) sounds like the "direct output committer" strategy that has been discussed in a few other contexts. Please be aware that this is unsafe in the presence of certain kinds of network partitions. It might be a rare case, but the consequences are disastrous: data loss or corruption. For example, Spark highly discourages a direct write strategy. (See SPARK-10063.)
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388230#comment-15388230 ]

Abdullah Yousufi commented on HIVE-14271:
-----------------------------------------

Agreed. I'll upload a patch for the second approach shortly.
[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination
[ https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388208#comment-15388208 ]

Steve Loughran commented on HIVE-14271:
---------------------------------------

Given that S3 rename is emulated by a recursive copy() + delete(), it's not clear that a copy() operation will provide any performance benefit, and it will still have the failure conditions of a non-atomic operation.
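A rough sketch of what "rename emulated by copy() + delete()" implies for failure handling is shown below. It is not the S3 connector's code: `emulated_rename` works on a plain dictionary, and `fail_after` is a hypothetical knob used only to simulate a failure partway through the copy phase.

```python
def emulated_rename(objects, src_prefix, dst_prefix, fail_after=None):
    """Rename by copying every object under src_prefix, then deleting the sources.

    fail_after (hypothetical) aborts the operation after that many copies.
    """
    keys = sorted(k for k in objects if k.startswith(src_prefix))
    copied = 0
    for key in keys:
        if fail_after is not None and copied >= fail_after:
            raise IOError("simulated failure during the copy phase")
        objects[dst_prefix + key[len(src_prefix):]] = objects[key]
        copied += 1
    # Sources are only removed once every copy has succeeded.
    for key in keys:
        del objects[key]


objects = {"tmp/part-000": "a", "tmp/part-001": "b"}
try:
    emulated_rename(objects, "tmp/", "final/", fail_after=1)
except IOError:
    pass

# The destination is now partially populated and the sources still exist.
leftover = sorted(objects)
```

After the simulated failure the store holds a mix of old and new keys, which is the non-atomic failure condition the comment refers to. Replacing the rename with an explicit copy() does not remove this window.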