[
https://issues.apache.org/jira/browse/PIG-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511168#comment-14511168
]
Kavya commented on PIG-4471:
----------------------------
Does the assert check for an empty bag work now, [~nielsbasjes]?
> ASSERT Bag is not empty and/or is within a specified size range.
> ----------------------------------------------------------------
>
> Key: PIG-4471
> URL: https://issues.apache.org/jira/browse/PIG-4471
> Project: Pig
> Issue Type: New Feature
> Components: internal-udfs, parser
> Reporter: Niels Basjes
>
> In PIG-3367 the ASSERT keyword was created.
> The current implementation checks, for each record in the bag, whether the
> value of a column is valid (and fails the job if it is not).
> We did several experiments and found that an empty bag (0 tuples) always
> succeeds. We need to ensure that a bag has been loaded correctly.
> *Proposed enhancements:*
> # Allow the ASSERT statement to check if a bag is empty.
> {code}
> A = LOAD 'data' AS (a0:int,a1:int,a2:int);
> ASSERT A NOT EMPTY, 'The A bag may not be empty';
> {code}
> # Allow the ASSERT statement to check if a bag has more than (or fewer than) a
> specific number of tuples.
> {code}
> A = LOAD 'data' AS (a0:int,a1:int,a2:int);
> ASSERT SIZE A > 100, 'The A bag is not big enough';
> ASSERT SIZE A < 1000, 'The A bag is too big';
> {code}
> #- For me this may be an approximate implementation, i.e. if I say it must
> have at least 5M tuples then it may still return 'is valid' if it has 4.9M
> tuples.
> NOTE: The syntax I show is just to give an idea of what I want to do.
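
The difference between the existing per-record check and the proposed bag-level checks can be modelled outside Pig. The following is a minimal Python sketch, not Pig's implementation: `assert_each` and `assert_size` are hypothetical helper names, a plain list stands in for a bag, and the strict bounds mirror the `>` and `<` in the illustrative syntax of the issue. It shows why a per-record check passes vacuously on an empty bag, while a size check does not.

```python
from typing import Callable, Optional, Sequence, Tuple

# One record with the schema (a0:int, a1:int, a2:int) from the issue.
Record = Tuple[int, int, int]


def assert_each(bag: Sequence[Record],
                predicate: Callable[[Record], bool],
                message: str) -> None:
    """Per-record check: fail on the first record that violates the predicate.

    With zero records the loop body never runs, so nothing can fail.
    """
    for record in bag:
        if not predicate(record):
            raise AssertionError(message)


def assert_size(bag: Sequence[Record],
                message: str,
                more_than: Optional[int] = None,
                less_than: Optional[int] = None) -> None:
    """Bag-level check (hypothetical): fail if the tuple count is out of range.

    Both bounds are strict. more_than=0 models 'NOT EMPTY'.
    """
    count = len(bag)
    if more_than is not None and not count > more_than:
        raise AssertionError(message)
    if less_than is not None and not count < less_than:
        raise AssertionError(message)


if __name__ == "__main__":
    empty_bag: Sequence[Record] = []

    # Passes silently even though nothing was loaded.
    assert_each(empty_bag, lambda r: r[0] > 0, "a0 should be greater than 0")

    # The bag-level check catches the empty bag.
    try:
        assert_size(empty_bag, "The A bag may not be empty", more_than=0)
    except AssertionError as err:
        print("failed:", err)
```

The sketch counts tuples exactly. The issue explicitly allows an approximate count, which this model does not attempt to represent.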