Enrico D'Urso created GRIFFIN-164:
-------------------------------------

             Summary: Make 'Regular expression detection count' available in UI
                 Key: GRIFFIN-164
                 URL: https://issues.apache.org/jira/browse/GRIFFIN-164
             Project: Griffin (Incubating)
          Issue Type: Improvement
    Affects Versions: 0.1.6-incubating
            Reporter: Enrico D'Urso


Hi,

I have been playing with Griffin for a month now.
In my experience, some companies (including the one I am working for as a 
consultant) prefer to do things through the UI.

Personally, I find the following feature very useful:

 
 * Regular expression detection count

That is, I have a column that should contain only numbers, and I want to check 
whether my ETL process has wrongly populated my table with non-numeric values.

I have been able to run such a job by creating the right config.json myself, in 
particular using spark-sql as the dialect:
{code:java}
select count(*) from src where account_id rlike '[^0-9]'
{code}
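To make the intent of the query above concrete, here is a minimal TypeScript sketch of what the count is expected to return. The sample values and the helper name countRegexMatches are hypothetical and only stand in for the account_id column; as I understand it, rlike matches when the pattern is found anywhere in the value, which is what RegExp.test does here.

```typescript
// Hypothetical helper, not part of Griffin: counts the values in which
// the pattern is found anywhere, mirroring a "count(*) ... rlike" query.
function countRegexMatches(values: string[], pattern: RegExp): number {
  return values.filter((value) => pattern.test(value)).length;
}

// Made-up sample data standing in for the account_id column.
const sample: string[] = ["1001", "1002", "10a3", "", "20 04"];

// [^0-9] matches any single character that is not a digit, so only
// "10a3" and "20 04" are counted; the empty string has nothing to match.
const nonNumeric: number = countRegexMatches(sample, /[^0-9]/);
console.log(nonNumeric); // 2
```

A result greater than zero would mean the ETL process wrote at least one non-numeric value.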
I saw that pr.component.ts contains a commented-out line of code:
{code:java}
// {"id":10,"itemName":"Regular Expression Detection Count","category": "Advanced Statistics"}
{code}
which I think is what I am talking about.

Also, I can read:
{code:java}
// case 'Regular Expression Detection Count':
//   return 'count(source.`'+col.name+'`) where source.`'+col.name+'` LIKE ';
{code}
which should be the griffin-dsl dialect, although the regex should probably be 
appended right after LIKE.
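As a rough sketch of what the UI could generate once the regex is appended, something along these lines might work. The function name regexDetectionCountRule is hypothetical, and both the RLIKE keyword and the quote and backslash escaping are assumptions borrowed from Spark SQL string literals; I have not checked what the griffin-dsl parser actually accepts.

```typescript
// Hypothetical helper, not existing Griffin code: builds the expression
// from the commented-out case, with the user's regex placed after the operator.
function regexDetectionCountRule(columnName: string, regex: string): string {
  // Assumption: the pattern is embedded as a single-quoted literal, so
  // backslashes are doubled first and single quotes are escaped afterwards.
  const literal: string = regex.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  const column: string = "source.`" + columnName + "`";
  // Assumption: RLIKE is used instead of LIKE, since LIKE takes a
  // wildcard pattern rather than a regular expression.
  return "count(" + column + ") where " + column + " RLIKE '" + literal + "'";
}

const rule: string = regexDetectionCountRule("account_id", "[^0-9]");
console.log(rule);
```

For the account_id example this yields count(source.`account_id`) where source.`account_id` RLIKE '[^0-9]', which is the shape the backend would then have to translate.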

Then, once the above griffin-dsl statement is available in the backend, the 
ProfilingRulePlanTrans class should map it to the Spark SQL 'rlike' clause.

I am not sure where (or whether) ProfilingRulePlanTrans should be modified, 
since preGroupbyClause should already contain everything, but I do not know 
the code well enough.

 

Please judge the priority of this feature yourselves; for someone who knows the 
code well, it should not be too hard to implement.

Thanks,

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
