[jira] Commented: (HIVE-563) UDF for parsing the URL

Zheng Shao (JIRA) Mon, 15 Jun 2009 22:00:32 -0700

    [ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719926#action_12719926 ]


Zheng Shao commented on HIVE-563:
---------------------------------

I agree with Raghu. The String comparisons are still OK (I think moving to a 
static HashMap would definitely help, but that is optional). Calling 
String.split() on every row, however, is a big performance hit. (This is part 
of the reason programs written in scripting languages often end up slow: people 
like to use String.split() in those languages.)

Can we cache "partToExtract" from the last call, and avoid calling 
String.split() again if "partToExtract" didn't change (which is the normal case)?
Can we loop through the query string instead of calling String.split()?
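
To make the two suggestions concrete, here is a minimal sketch of what they 
might look like together. It is not taken from patch_563.txt: the class name 
QueryParamScanner, the method extract(), and the assumption that the query 
string arrives without its leading '?' are all hypothetical. The "key=" prefix 
is rebuilt only when the requested key changes between calls, and the query 
string is walked with indexOf() rather than String.split().

```java
/**
 * Hypothetical helper (not from the HIVE-563 patch): looks up one key in a
 * URL query string without calling String.split().
 */
final class QueryParamScanner {
    // Cached from the previous call; consecutive rows normally ask for the same key.
    private String lastKey;
    private String lastPrefix; // always lastKey + "="

    /**
     * Returns the value of the first "key=value" pair in query, or null if
     * the key is absent. Assumes query has no leading '?'.
     */
    String extract(String query, String key) {
        if (query == null || key == null) {
            return null;
        }
        // Rebuild the cached prefix only when the requested key changed.
        if (!key.equals(lastKey)) {
            lastKey = key;
            lastPrefix = key + "=";
        }
        final int len = query.length();
        int start = 0;
        while (start < len) {
            // Find the end of the current "name=value" segment.
            int end = query.indexOf('&', start);
            if (end < 0) {
                end = len;
            }
            // The prefix must match at the segment start and fit inside the segment.
            if (query.startsWith(lastPrefix, start)
                    && start + lastPrefix.length() <= end) {
                return query.substring(start + lastPrefix.length(), end);
            }
            start = end + 1; // skip past '&'
        }
        return null;
    }
}
```

With this sketch, extract("a=1&b=2&c=", "b") returns "2", the key "c" returns 
an empty string, and a missing key returns null. Only the returned substring 
is allocated per call; whether that matters enough to justify the extra code 
would need to be measured against the patch.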


> UDF for parsing the URL
> -----------------------
>
>                 Key: HIVE-563
>                 URL: https://issues.apache.org/jira/browse/HIVE-563
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Suresh Antony
>            Assignee: Suresh Antony
>         Attachments: patch_563.txt, patch_563.txt.1
>
>
> Needs a UDF to extract the parts of a URL from a URL string.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
