[jira] [Updated] (HIVE-4005) Column truncation

2013-04-25 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

   Resolution: Fixed
Fix Version/s: 0.12.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks, Kevin.

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.12.0

 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
 HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt


 Column truncation allows users to remove data for columns that are no longer 
 useful.
 This is done by removing the data for the column and setting the length of 
 the column data and related lengths to 0 in the RC file header.
 RC file was fixed to recognize columns with a length of zero as empty. Such 
 columns are treated as if they do not exist in the data: a null is returned 
 for every value of that column in every row. This is the same thing that 
 happens when more columns are selected than exist in the file.
 A new command was added to the CLI:
 TRUNCATE TABLE ... PARTITION ... COLUMNS ...
 This launches a map-only job in which each mapper rewrites a single file 
 without the unnecessary column data and with the adjusted headers. It does 
 not uncompress or deserialize the data, so it is much faster than rewriting 
 the data with NULLs.
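
The idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual RC file layout or the code in the attached patches: the RowGroup class, its col_lengths and col_data fields, and the truncate_columns and read_column helpers are all hypothetical names invented for illustration. It shows the two behaviours the description relies on: truncation zeroes a column's length in the header and drops its bytes without decoding anything, and a reader returns NULL for a zero-length column in the same way it does for a column that does not exist.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RowGroup:
    """Toy columnar row group (hypothetical, NOT the real RC file format)."""
    num_rows: int
    col_lengths: List[int]   # "header": byte length of each column buffer
    col_data: List[bytes]    # "body": one opaque buffer per column


def truncate_columns(group: RowGroup, drop: List[int]) -> RowGroup:
    """Rewrite a row group without the bytes of the dropped columns.

    Buffers are kept or discarded as opaque bytes; nothing is decoded,
    which is why this is cheaper than rewriting every row with NULLs.
    """
    dropped = set(drop)
    lengths = [0 if i in dropped else n for i, n in enumerate(group.col_lengths)]
    data = [b"" if i in dropped else buf for i, buf in enumerate(group.col_data)]
    return RowGroup(group.num_rows, lengths, data)


def read_column(
    group: RowGroup, col: int, decode: Callable[[bytes], List[str]]
) -> List[Optional[str]]:
    """Return one value per row, or NULL (None) for truncated/missing columns."""
    missing = col >= len(group.col_lengths)
    if missing or group.col_lengths[col] == 0:
        return [None] * group.num_rows
    return list(decode(group.col_data[col]))


def decode_lines(buf: bytes) -> List[str]:
    """Toy encoding assumed for this sketch: newline-separated UTF-8 values."""
    return buf.decode("utf-8").split("\n")


group = RowGroup(num_rows=2, col_lengths=[3, 5], col_data=[b"a\nb", b"x1\ny2"])
truncated = truncate_columns(group, drop=[1])

print(read_column(truncated, 0, decode_lines))  # untouched column
print(read_column(truncated, 1, decode_lines))  # truncated column
print(read_column(truncated, 5, decode_lines))  # column beyond those in the file
```

In this model, one call to truncate_columns per row group of a file stands in for the work a single mapper would do on that file.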

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
 HIVE-4005.6.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
 HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.7.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
 HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

Refreshed.

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
 HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-03-12 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-22 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.5.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-21 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.4.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.3.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-20 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

comments

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
 HIVE-4005.3.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4005:
-

Status: Open  (was: Patch Available)

comments

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.1.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.2.patch.txt

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt





[jira] [Updated] (HIVE-4005) Column truncation

2013-02-08 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

 Column truncation
 -

 Key: HIVE-4005
 URL: https://issues.apache.org/jira/browse/HIVE-4005
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt


