[jira] [Updated] (HIVE-2309) Incorrect regular expression for extracting task id from filename

2011-07-26 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-2309:


Attachment: HIVE-2309.1.patch

 Incorrect regular expression for extracting task id from filename
 -----------------------------------------------------------------

 Key: HIVE-2309
 URL: https://issues.apache.org/jira/browse/HIVE-2309
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.7.1
Reporter: Paul Yang
Priority: Minor
 Attachments: HIVE-2309.1.patch


 To produce the correct filenames for bucketed tables, a method in
 Utilities.java extracts the task id from the filename and replaces
 it with the bucket number. The regex used to extract this value has a bug
 for attempt numbers >= 10: group 1 captures the attempt number instead of
 the task id:
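 {code}
 >>> import re
 >>> # Attempt number 10: group 1 is the attempt number, not the task id
 >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
 ...          'attempt_201107090429_64965_m_001210_10').group(1)
 '10'
 >>> # Attempt number 9: group 1 is the task id, as intended
 >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
 ...          'attempt_201107090429_64965_m_001210_9').group(1)
 '001210'
 {code}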
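
 The behaviour above follows from the optional group (_[0-9])?, which
 accepts only a single-digit attempt suffix. The sketch below compares the
 quoted pattern with one possible adjustment. The CANDIDATE pattern and the
 task_id helper are assumptions made for illustration and are not taken from
 the attached patches; the actual fix may differ.

```python
import re

# Pattern quoted in the description: the optional attempt suffix
# allows exactly one digit.
ORIGINAL = re.compile(r"^.*?([0-9]+)(_[0-9])?(\..*)?$")

# Hypothetical variant (an assumption, not the attached patch):
# let the attempt suffix span one or more digits.
CANDIDATE = re.compile(r"^.*?([0-9]+)(_[0-9]+)?(\..*)?$")


def task_id(pattern, filename):
    """Return capture group 1, which is meant to hold the task id."""
    match = pattern.match(filename)
    return match.group(1) if match else None


names = [
    "attempt_201107090429_64965_m_001210_9",   # single-digit attempt
    "attempt_201107090429_64965_m_001210_10",  # two-digit attempt
]

for name in names:
    print(name, task_id(ORIGINAL, name), task_id(CANDIDATE, name))
```

 With the quoted pattern the two-digit attempt yields '10'; with the
 hypothetical variant both filenames yield '001210'.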

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2309) Incorrect regular expression for extracting task id from filename

2011-07-26 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-2309:


Attachment: HIVE-2309.2.patch

 Incorrect regular expression for extracting task id from filename
 -----------------------------------------------------------------

 Key: HIVE-2309
 URL: https://issues.apache.org/jira/browse/HIVE-2309
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.7.1
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor
 Attachments: HIVE-2309.1.patch, HIVE-2309.2.patch


 To produce the correct filenames for bucketed tables, a method in
 Utilities.java extracts the task id from the filename and replaces
 it with the bucket number. The regex used to extract this value has a bug
 for attempt numbers >= 10: group 1 captures the attempt number instead of
 the task id:
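 {code}
 >>> import re
 >>> # Attempt number 10: group 1 is the attempt number, not the task id
 >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
 ...          'attempt_201107090429_64965_m_001210_10').group(1)
 '10'
 >>> # Attempt number 9: group 1 is the task id, as intended
 >>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
 ...          'attempt_201107090429_64965_m_001210_9').group(1)
 '001210'
 {code}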
