[jira] [Updated] (HIVE-4005) Column truncation

2013-04-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

   Resolution: Fixed
Fix Version/s: 0.12.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks, Kevin.

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.12.0
>
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt
>
>
> Column truncation allows users to remove data for columns that are no longer 
> useful.
> This is done by removing the data for the column and setting the length of 
> the column data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with lengths of zero as empty. Such 
> columns are treated as if they don't exist in the data, so a null is returned 
> for every value of that column in every row. This is the same thing that 
> happens when more columns are selected than exist in the file.
> A new command was added to the CLI:
> TRUNCATE TABLE ... PARTITION ... COLUMNS ...
> This launches a map-only job where each mapper rewrites a single file without 
> the unnecessary column data and with the adjusted headers. It does not 
> uncompress/deserialize the data, so it is much faster than rewriting the data 
> with NULLs.
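
The behaviour described above can be illustrated with a small, self-contained sketch. This is not Hive's RCFile code: the ColumnChunk and RowGroup classes, the function names, and the use of zlib are all assumptions made for illustration. It only models the two ideas from the description: truncation drops a column's bytes and records a length of 0 without decompressing anything, and a reader returns a null for every row of a zero-length or missing column.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunk:
    """Toy stand-in for one column's slice of a row group (hypothetical)."""
    compressed_len: int  # length recorded in the toy header
    data: bytes          # compressed column bytes


@dataclass
class RowGroup:
    """Toy stand-in for a group of rows stored column by column."""
    num_rows: int
    columns: List[ColumnChunk]


def make_row_group(rows: List[tuple]) -> RowGroup:
    """Build a toy row group from equal-length tuples of strings."""
    columns = []
    for values in zip(*rows):
        payload = zlib.compress("\n".join(values).encode("utf-8"))
        columns.append(ColumnChunk(len(payload), payload))
    return RowGroup(len(rows), columns)


def truncate_columns(group: RowGroup, indexes: List[int]) -> RowGroup:
    """Drop the bytes of the given columns and record their length as 0.

    Columns that are kept are passed through untouched; nothing is
    decompressed or deserialized.
    """
    drop = set(indexes)
    columns = [
        ColumnChunk(0, b"") if i in drop else chunk
        for i, chunk in enumerate(group.columns)
    ]
    return RowGroup(group.num_rows, columns)


def read_column(group: RowGroup, index: int) -> List[Optional[str]]:
    """Return one value per row; None for a zero-length or missing column."""
    if index >= len(group.columns) or group.columns[index].compressed_len == 0:
        return [None] * group.num_rows
    text = zlib.decompress(group.columns[index].data).decode("utf-8")
    return text.split("\n")


if __name__ == "__main__":
    group = make_row_group([("1", "a", "x"), ("2", "b", "y")])
    truncated = truncate_columns(group, [1])
    print(read_column(truncated, 0))  # kept column
    print(read_column(truncated, 1))  # truncated column
    print(read_column(truncated, 5))  # column beyond those in the file
```

In this sketch a truncated column and a column index past the end of the file take the same code path in read_column, which mirrors the equivalence stated in the description.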

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

Refreshed.

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.7.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-03-12 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-22 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.5.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-21 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.4.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

comments

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.3.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

comments

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.2.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt


[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.1.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt