[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530588#comment-13530588 ]
Cheolsoo Park commented on PIG-2553:
------------------------------------
Sorry for the delay. Here are some comments. Please let me know what you think.
- Wouldn't it make more sense to make this constant public? The property itself
is public, and the constant may be reused somewhere else in the future.
{code}
private static final String PIG_LOCATION_CHECK_STRICT =
    "pig.location.check.strict";
{code}
- Can you check whether {{PIG_LOCATION_CHECK_STRICT}} is enabled before calling
{{getStoreLocIfInvalid(storeOps)}}? That way we avoid the call when the check is
disabled.
{code}
LOStore invalidStore = getStoreLocIfInvalid(storeOps);
if (invalidStore != null &&
        "true".equals(pigContext.getProperties().getProperty(PIG_LOCATION_CHECK_STRICT))) {
    throw new RuntimeException("Script contains 2 or more STORE statements "
            + "writing to same location : " + invalidStore.getFileSpec().getFileName());
}
{code}
- Wouldn't it make more sense for {{getStoreLocIfInvalid()}} to return the
filename as {{String}} instead of {{LOStore}}? {{LOStore}} seems unnecessary to
me.
- I am not sure that creating the {{admin}} section in the docs makes sense. Even
if an admin sets this property, users can always override it by running Pig with
{{-Dpig.location.check.strict=false}}. So I don't think this property is any
different from other user properties. Can we document it in
{{conf/pig.properties}} like we did for other properties?
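
To illustrate the second and third points above, here is a minimal sketch of the shape being suggested. It is not taken from the patch: the class {{StoreLocationCheck}} and the methods {{findDuplicateStoreLocation}} and {{checkStoreLocations}} are hypothetical names, and plain location strings with a {{java.util.Properties}} object stand in for {{LOStore}} and {{PigContext}}.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

public class StoreLocationCheck {
    // Public so that other classes can reuse the property name.
    public static final String PIG_LOCATION_CHECK_STRICT = "pig.location.check.strict";

    // Hypothetical stand-in for getStoreLocIfInvalid(): returns the first
    // location that appears more than once, or null if all are distinct.
    public static String findDuplicateStoreLocation(List<String> storeLocations) {
        Set<String> seen = new HashSet<>();
        for (String location : storeLocations) {
            if (!seen.add(location)) {
                return location;
            }
        }
        return null;
    }

    // Reads the flag first, so the scan is skipped unless strict mode is on.
    public static void checkStoreLocations(Properties props, List<String> storeLocations) {
        if (!"true".equals(props.getProperty(PIG_LOCATION_CHECK_STRICT))) {
            return;
        }
        String duplicate = findDuplicateStoreLocation(storeLocations);
        if (duplicate != null) {
            throw new RuntimeException("Script contains 2 or more STORE statements "
                    + "writing to same location : " + duplicate);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PIG_LOCATION_CHECK_STRICT, "true");
        try {
            checkStoreLocations(props, Arrays.asList("/out/a", "/out/b", "/out/a"));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Returning the location as a {{String}} keeps the caller independent of the operator type, and testing the flag first means the duplicate scan only runs when strict checking is enabled.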
> Pig shouldn't allow attempts to write multiple relations into same directory
> ----------------------------------------------------------------------------
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
> Issue Type: Improvement
> Reporter: Dmitriy V. Ryaboy
> Assignee: Prashant Kommireddi
> Attachments: PIG-2553_1.patch, PIG-2553.patch
>
>
> We've seen multiple occasions where users accidentally try to store 2 or more
> different relations to the same destination directory. Currently, this passes
> the Pig planner and fails on MR side due to concurrent attempts to create the
> same part file on the reducer. This is extremely confusing to the user, and
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we
> can identify the erroneous condition from the beginning.