[ 
https://issues.apache.org/jira/browse/PHOENIX-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511817#comment-14511817
 ] 

Rajeshbabu Chintaguntla commented on PHOENIX-538:
-------------------------------------------------

[~giacomotaylor] [~samarthjain] Thank you very much for reviews.
Uploaded patch addressing the review comments.
bq. How about renaming PCommon to PMetaData or PMetaDataEntity?
Renamed it to PMetaDataEntity.

bq. One potential issue with UDFs is that you don't know if users will hold 
state in their UDF implementation in member variables. Because of that, it's 
probably safest to override the UDFExpression.getDeterminism() method to return 
PER_INVOCATION which will force the client to create a new Expression tree 
anytime a UDF is used (basically, each thread will have its own copy when an 
UPSERT SELECT is running, so member variables wouldn't cause a problem).
[~jamestaylor] if we return PER_INVOCATION then index creation is failing. Any 
other alternative?
bq. I'm not seeing any unit tests around CREATE OR REPLACE FUNCTION. Is that 
implemented yet? How would that deal with unloading an existing jar? Or would 
it be required to place it in a new jar. Is that enforced?
I will handle replace as part of PHOENIX-1889.
[~samarthjain] 
I have handled the comments you have pointed. 
bq. In general, I think the synchronization model could be simplified. For 
example - I don't think you need the KeyLocker because you are already using a 
concurrent map. Also there is a bug in your code (added as comment). See the 
code snippet below that simplifies the synchronization model:
Just want to clarify one thing here the map is concurrent map so get and 
putifAbsent will be atomic but there is a chance like between get and put if 
other client already puts a class loader for the tenant then the classloader 
and configuration object created become garbage. That's why using locker. If 
that's ok I can do as you suggested. What do u say?

Apart from added tests for temporary functions in the patch.
[~samarthjain] Sure I will create pull request now.
Thanks.




> Support UDFs
> ------------
>
>                 Key: PHOENIX-538
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-538
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Rajeshbabu Chintaguntla
>             Fix For: 5.0.0, 4.4.0
>
>         Attachments: PHOENIX-538-wip.patch, PHOENIX-538_v1.patch, 
> PHOENIX-538_v2.patch, PHOENIX-538_v3.patch, PHOENIX-538_v4.patch, 
> PHOENIX-538_v5.patch, PHOENIX-538_v6.patch
>
>
> Phoenix allows built-in functions to be added (as described 
> [here](http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html))
>  with the restriction that they must be in the phoenix jar. We should improve 
> on this and allow folks to declare new functions through a CREATE FUNCTION 
> command like this:
>       CREATE FUNCTION mdHash(anytype)
>       RETURNS binary(16)
>       LOCATION 'hdfs://path-to-my-jar' 'com.me.MDHashFunction'
> Since HBase supports loading jars dynamically, this would not be too 
> difficult. The function implementation class would be required to extend our 
> ScalarFunction base class. Here's how I could see it being implemented:
> * modify the phoenix grammar to support the new CREATE FUNCTION syntax
> * create a new UTFParseNode class to capture the parse state
> * add a new method to the MetaDataProtocol interface
> * add a new method in ConnectionQueryServices to invoke the MetaDataProtocol 
> method
> * add a new method in MetaDataClient to invoke the ConnectionQueryServices 
> method
> * persist functions in a new "SYSTEM.FUNCTION" table
> * add a new client-side representation to cache functions called PFunction
> * modify ColumnResolver to dynamically resolve a function in the same way we 
> dynamically resolve and load a table
> * create and register a new ExpressionType called UDFExpression
> * at parse time, check for the function name in the built in list first (as 
> is currently done), and if not found in the PFunction cache. If not found 
> there, then use the new UDFExpression as a placeholder and have the 
> ColumnResolver attempt to resolve it at compile time and throw an error if 
> unsuccessful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to