[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Status: Patch Available (was: In Progress)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.4.patch

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Status: In Progress (was: Patch Available)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: (was: HIVE-14404.4.patch)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Release Note: (was: Introduced the new "dsv2" outputformat, which supports multiple characters as delimiter.)
    Status: Patch Available (was: In Progress)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.4.patch

Attached a new patch:
- Fixed the findings from the review.
- Removed Super CSV from the beeline.cmd and beeline.sh files, since it is no longer needed.
- Fixed the output format for cases when the result contains empty or null values, and added test cases with empty and null values.

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.4.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Status: In Progress (was: Patch Available)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: (was: HIVE-14404.3.patch)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.3.patch

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.3.patch

Fixed the patch according to the review:
- Changed the logic for generating the output for DSV formats so that it no longer uses SuperCSV and supports multiple characters as delimiter.
- Removed the SuperCSV library from the project, since it is not used anywhere else.

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.3.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.2.patch

Fixed the patch according to the review:
- Display an error message if a multiple-character delimiter is set with the dsv output format. In this case it falls back to the default dsv delimiter.
- Introduced a new constant for the default dsv2 delimiter to avoid String<->char conversions.

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.2.patch, HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
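The fallback described in the HIVE-14404.2.patch comment above might look roughly like the following. This is only an illustrative sketch, not code from the patch: the class name DsvDelimiterSketch, the method resolve, the wording of the error message, and the assumption that the default dsv delimiter is '|' are all hypothetical.

```java
// Illustrative sketch only; names are hypothetical and not taken from the patch.
final class DsvDelimiterSketch {
    // Assumed default delimiter for the dsv outputformat.
    static final char DEFAULT_DSV_DELIMITER = '|';

    /**
     * Returns the delimiter to use for the single-character dsv format.
     * Falls back to the default (and reports an error) when the configured
     * value is missing or is not exactly one character long.
     */
    static char resolve(String configured) {
        if (configured != null && configured.length() == 1) {
            return configured.charAt(0);
        }
        System.err.println("Delimiter \"" + configured
                + "\" is not a single character; using default '"
                + DEFAULT_DSV_DELIMITER + "' instead.");
        return DEFAULT_DSV_DELIMITER;
    }
}
```

With this sketch, resolve("^") returns '^', while resolve("^-^") prints the error to stderr and returns the default.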
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Status: Patch Available (was: Open)

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Release Note: Introduced the new "dsv2" outputformat, which supports multiple characters as delimiter.
    Status: Patch Available (was: Open)

Introduced a new outputformat (dsv2) which supports multiple characters as delimiter.

The Super CSV library is used to generate the dsv, csv2 and tsv2 outputformats, and it does not support multiple characters as delimiter. Since the same logic generates the csv2, tsv2 and dsv outputformats, I decided not to change this logic, but rather to introduce a new outputformat (dsv2) which supports multiple characters as delimiter. The new dsv2 outputformat has the same escaping logic as the dsv outputformat if quoting is not disabled.

Extended the TestBeeLineWithArgs tests with new test steps which use multiple characters as delimiter.

Main changes in the code:
- Changed the SeparatedValuesOutputFormat class to be an abstract class and created two new child classes to separate the logic for single-character and multi-character delimiters: SingleCharSeparatedValuesOutputFormat and MultiCharSeparatedValuesOutputFormat.
- Kept the methods used by both children in SeparatedValuesOutputFormat and moved the methods specific to the single-character case to the SingleCharSeparatedValuesOutputFormat class.
- Did not change the logic which was in SeparatedValuesOutputFormat; only moved some parts to the child class.
- Implemented the value escaping and the concatenation with the delimiter string in MultiCharSeparatedValuesOutputFormat.

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
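The class split listed under "Main changes in the code" in the message above can be pictured with a small sketch. It is not code from HIVE-14404.patch: the class names SeparatedValuesSketch, SingleCharSketch and MultiCharSketch, the helper quoteIfNeeded, and the CSV-style quoting rule (wrap a value in double quotes when it contains the delimiter, a quote or a newline, and double any embedded quotes) are assumptions made for illustration. Null values are not handled here.

```java
// Illustrative sketch only; these are not the classes from the patch.
abstract class SeparatedValuesSketch {
    /** Formats one result row; each subclass applies its own delimiter type. */
    abstract String formatRow(String[] values);

    /** Shared helper: CSV-style quoting (an assumed escaping rule). */
    static String quoteIfNeeded(String value, String delimiter) {
        boolean needsQuotes = value.contains(delimiter)
                || value.contains("\"")
                || value.contains("\n");
        if (!needsQuotes) {
            return value;
        }
        // Double embedded quotes, then wrap the whole value in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }
}

/** Single-character case: the delimiter is kept as a char. */
final class SingleCharSketch extends SeparatedValuesSketch {
    private final char delimiter;

    SingleCharSketch(char delimiter) {
        this.delimiter = delimiter;
    }

    @Override
    String formatRow(String[] values) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                row.append(delimiter);
            }
            row.append(quoteIfNeeded(values[i], String.valueOf(delimiter)));
        }
        return row.toString();
    }
}

/** Multi-character case: escape each value, then join with the delimiter string. */
final class MultiCharSketch extends SeparatedValuesSketch {
    private final String delimiter;

    MultiCharSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    @Override
    String formatRow(String[] values) {
        String[] escaped = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            escaped[i] = quoteIfNeeded(values[i], delimiter);
        }
        return String.join(delimiter, escaped);
    }
}
```

For the row from the issue description, new MultiCharSketch("^-^").formatRow(...) joins the values with the full "^-^" string instead of only its first character.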
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-14404:
---------------------------------
    Attachment: HIVE-14404.patch

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Marta Kuczora
> Attachments: HIVE-14404.patch
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters
[ https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naveen Gangam updated HIVE-14404:
---------------------------------
    Assignee: Peter Vary

> Allow delimiterfordsv to use multiple-character delimiters
> ----------------------------------------------------------
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
> Issue Type: Improvement
> Reporter: Stephen Measmer
> Assignee: Peter Vary
>
> HIVE-5871 allows for reading multiple character delimiters. Would like the
> ability to use outputformat=dsv and define multiple character delimiters.
> Today delimiterfordsv only uses one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
> I get:
> 111201081253106275^31-Oct-2011 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1