[ 
https://issues.apache.org/jira/browse/PIG-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876190#comment-14876190
 ] 

Rohini Palaniswamy commented on PIG-4673:
-----------------------------------------

Good feature [[email protected]]. Would you be interested in enhancing 
this UDF for better performance in a new jira?

http://stackoverflow.com/questions/7661460/replace-multiple-substrings-at-once/7661573#7661573
 - You can compile the Pattern once, cache it (with a limit on the cache size 
if the search strings are variable rather than constant), and do all the 
replacements in a single pass. I have seen a lot of jobs suffer in performance 
because of UDFs that do regex matching without reusing the compiled Pattern. 
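
A rough sketch of what I mean, in plain Java rather than as an actual Pig 
EvalFunc. The class name, the cache limit and the method names are made up for 
illustration and are not taken from the attached patch. It assumes the search 
keys are non-empty literal strings (hence Pattern.quote) and that the 
replacement map keeps a stable iteration order.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the PIG-4673 patch.
public class MultiReplaceSketch {

    // Assumed cap on cached patterns; tune it for the workload.
    private static final int MAX_CACHED_PATTERNS = 100;

    // Access-ordered LinkedHashMap acting as a small LRU cache,
    // keyed by the alternation regex built from the search keys.
    private static final Map<String, Pattern> PATTERN_CACHE =
        new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
                return size() > MAX_CACHED_PATTERNS;
            }
        };

    // Builds "\QA\E|\QB\E|..." from the keys and compiles it at most once
    // while the entry stays in the cache.
    static Pattern patternFor(Map<String, String> replacements) {
        StringJoiner alternation = new StringJoiner("|");
        for (String key : replacements.keySet()) {
            alternation.add(Pattern.quote(key)); // keys are literals, not regexes
        }
        String regex = alternation.toString();
        synchronized (PATTERN_CACHE) {
            Pattern cached = PATTERN_CACHE.get(regex);
            if (cached == null) {
                cached = Pattern.compile(regex);
                PATTERN_CACHE.put(regex, cached);
            }
            return cached;
        }
    }

    // Replaces every occurrence of every key in a single pass over the source.
    public static String replaceMulti(String source, Map<String, String> replacements) {
        if (source == null || replacements.isEmpty()) {
            return source;
        }
        Matcher matcher = patternFor(replacements).matcher(source);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // quoteReplacement keeps '$' and '\' in the values literal.
            String value = replacements.get(matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("A", "1");
        map.put("B", "2");
        map.put("C", "3");
        map.put("D", "4");
        System.out.println(replaceMulti("A1B2C3D4", map)); // prints 11223344
    }
}
```

Note that a single pass never rescans text it has already replaced, so the 
result can differ from chained REPLACE calls when one replacement value 
contains another search key. Also, if one key is a prefix of another, the one 
that comes first in the alternation wins.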

> Built In UDF - REPLACE_MULTI : For a given string, search and replace all 
> occurrences of search keys with replacement values. 
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4673
>                 URL: https://issues.apache.org/jira/browse/PIG-4673
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: site
>            Reporter: Murali Rao
>            Assignee: Murali Rao
>            Priority: Minor
>              Labels: None
>             Fix For: 0.16.0
>
>         Attachments: PIG-4673-1.patch, replace_multi_udf.patch
>
>
> Let's say we have the string 'A1B2C3D4'. Our objective is to replace A with 1, 
> B with 2, C with 3 and D with 4 to derive the string '11223344'. 
> Using the existing REPLACE method: 
> REPLACE(REPLACE(REPLACE(REPLACE('A1B2C3D4','A','1'),'B','2'),'C','3'),'D','4')
>  
> With the proposed UDF, REPLACE_MULTI:
> General syntax: 
> REPLACE_MULTI ( sourceString,  [  search1#replacement1, ... ] )
> REPLACE_MULTI ( 'A1B2C3D4',  [ 'A'#'1','B'#'2', 'C'#'3', 'D'#'4' ] )
> Advantages: 
>       1. Fewer function calls. 
>       2. Easier to code and more readable.
>       
> Let me know your thoughts/inputs on having this UDF in Piggy Bank. I will take 
> this up based on your feedback.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
