[ 
https://issues.apache.org/jira/browse/HIVE-10722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563782#comment-14563782
 ] 

Sushanth Sowmyan commented on HIVE-10722:
-----------------------------------------

I wouldn't call it pointless - I see value in "throw" as a setting here, since 
it guarantees that the process errors out if there are invalid partitions.

In that case, I think we should have "proof" that the strict case really does 
fail rather than silently succeed. Without a negative test, someone could 
later modify this code so that the "throw" behaviour also succeeds. That is a 
real possibility, because your code never checks for "throw" explicitly: it 
checks for "skip" and "ignore", and defaults to "throw" if neither of those 
paths is taken. With no explicit check or comment there, it is easy for 
someone modifying the code later not to realize that the default path serves 
a purpose.
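
To make the concern concrete, here is a minimal sketch of what explicit 
handling plus a negative check could look like. It is not the code from the 
patch: the class, the enum, the validity rule (printable ASCII only) and the 
reading of "skip" as "drop the invalid name" and "ignore" as "keep it anyway" 
are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the actual patch.
final class PathValidationSketch {

    /** Hypothetical modes mirroring the "throw", "skip" and "ignore" settings. */
    enum Mode { THROW, SKIP, IGNORE }

    /** Assumed validity rule for this sketch: printable ASCII characters only. */
    static boolean isValid(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x20 || c > 0x7e) {
                return false;
            }
        }
        return true;
    }

    /** Returns the partition names to add, handling every mode explicitly. */
    static List<String> filter(List<String> names, Mode mode) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (isValid(name)) {
                kept.add(name);
                continue;
            }
            switch (mode) {
                case THROW:
                    // Strict mode is spelled out rather than left as a fall-through,
                    // so a later edit cannot silently turn it into a success path.
                    throw new IllegalArgumentException("Invalid partition name: " + name);
                case SKIP:
                    // Assumed meaning: drop the invalid name and carry on.
                    break;
                case IGNORE:
                    // Assumed meaning: keep the name without validating it.
                    kept.add(name);
                    break;
                default:
                    throw new IllegalStateException("Unhandled mode: " + mode);
            }
        }
        return kept;
    }

    /** Negative check: strict mode must fail on an invalid name. */
    public static void main(String[] args) {
        List<String> names = Arrays.asList("ds=2015-05-28", "ds=bad\u0001");
        try {
            filter(names, Mode.THROW);
            throw new AssertionError("THROW mode succeeded on an invalid name");
        } catch (IllegalArgumentException expected) {
            System.out.println("THROW mode failed as expected");
        }
    }
}
```

The point of the explicit THROW branch and the default branch is that a new 
or mistyped mode can no longer slip into a silently succeeding path, and the 
negative check in main would start failing if someone changed that.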

I'd still very much like a test case for this, but that can be handled in a 
separate JIRA and taken on at a later date. If you're okay with that, I'm +1.

> external table creation with msck in Hive can create unusable partition
> -----------------------------------------------------------------------
>
>                 Key: HIVE-10722
>                 URL: https://issues.apache.org/jira/browse/HIVE-10722
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.1, 1.0.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: HIVE-10722.patch
>
>
> There can be directories in HDFS containing unprintable characters; when 
> doing hadoop fs -ls, these characters are not even visible, and can only be 
> seen if, for example, the output is piped through od.
> When these are loaded via msck, they are stored in e.g. mysql as "?" (literal 
> question mark, findable via LIKE '%?%' in db) and show accordingly in Hive.
> However, datanucleus appears to encode it as %3F; this causes the partition 
> to be unusable - it cannot be dropped, and other operations like drop table 
> get stuck (didn't investigate in detail why; drop table got unstuck as soon 
> as the partition was removed from metastore).
> We should probably have a 2-way option for such cases - error out on load 
> (default), or convert to '?'/drop such characters (and have partition that 
> actually works, too).
> We should also check if partitions with '?' inserted explicitly work at all 
> with datanucleus.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
