[
https://issues.apache.org/jira/browse/ATLAS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aditya Gupta updated ATLAS-5148:
--------------------------------
Description:
{{Table groupby(createTime) select owner, name, max(createTime)}}
I ran this query in DSL search, but it returned 0 results
even though data loaded by quick_start.py was present.
However, these subset queries worked:
{{Table groupby(createTime)}}
{{Table select owner, name, max(createTime)}}
*Problem Description (Before Fix)*
+1. GROUP BY works only for String attributes+
Query:
{color:#ff0000}Table groupby(owner){color}
Works
Query:
{color:#ff0000}Table groupby(createTime){color}
Returns no records
Both owner (String) and createTime (Date/Long) attributes exist on the same
hive_table entities.
+2. GROUP BY with SELECT fails for DATE attributes+
Query:
{color:#ff0000}Table groupby(createTime) select owner, name,
max(createTime){color}
Returns no records
Similar query using a String attribute works as expected:
{color:#ff0000}Table groupby(owner) select owner, count(){color}
Works
+3. Root Cause+
GROUP BY keys were handled only when they were of type String.
When grouping by a non-String attribute (Date/Long), valid group keys were
dropped during result processing.
This produced empty results even though the grouping itself succeeded internally.
*Fix Description (After Fix)*
+1. Handle GROUP BY keys of any data type+
Updated method:
{color:#ff0000}AtlasJanusGraphTraversal.getAtlasVertexMap(){color}
Logic change:
{color:#ff0000}String keyStr = (key instanceof String) ? (String) key :
String.valueOf(key);{color}
This converts GROUP BY keys of type Long, Date, Integer, etc. to String,
which prevents valid GROUP BY results from being dropped.
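The effect of the logic change can be illustrated with a minimal, standalone sketch. This is not the actual body of AtlasJanusGraphTraversal.getAtlasVertexMap(); the class name GroupKeySketch, the two helper methods, and the sample table names and timestamps are hypothetical and exist only to contrast a String-only key check with the String.valueOf(key) conversion quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not code from the Atlas repository.
final class GroupKeySketch {

    // Assumed "before" behaviour: entries whose group key is not a String are skipped.
    static <V> Map<String, List<V>> stringKeysOnly(Map<Object, List<V>> grouped) {
        Map<String, List<V>> result = new LinkedHashMap<>();
        for (Map.Entry<Object, List<V>> entry : grouped.entrySet()) {
            if (entry.getKey() instanceof String) {
                result.put((String) entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    // "After" behaviour: every key is converted to a String, as in the quoted logic change.
    static <V> Map<String, List<V>> anyKeyType(Map<Object, List<V>> grouped) {
        Map<String, List<V>> result = new LinkedHashMap<>();
        for (Map.Entry<Object, List<V>> entry : grouped.entrySet()) {
            Object key = entry.getKey();
            String keyStr = (key instanceof String) ? (String) key : String.valueOf(key);
            // Merge rather than overwrite in case two keys share a string form.
            result.computeIfAbsent(keyStr, k -> new ArrayList<>()).addAll(entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up grouping by a Long createTime value (epoch milliseconds).
        Map<Object, List<String>> byCreateTime = new LinkedHashMap<>();
        byCreateTime.put(1700000000000L, Arrays.asList("table_a", "table_b"));
        byCreateTime.put(1700000100000L, Arrays.asList("table_c"));

        System.out.println("String-only keys: " + stringKeysOnly(byCreateTime).size());
        System.out.println("Any key type:     " + anyKeyType(byCreateTime).size());
    }
}
```

With Long keys, the String-only variant yields an empty map while the converting variant keeps both groups, which matches the symptom described for groupby(createTime).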
+2. GROUP BY on DATE attributes now works correctly+
Query:
{color:#ff0000}Table groupby(createTime){color}
Returns grouped results
+3. DSL queries now behave as documented+
The following queries now return valid results:
{color:#ff0000}Table groupby(createTime) select owner, name,
min(createTime){color}
{color:#ff0000}Table groupby(createTime) select owner, name,
max(createTime){color}
Doc Attached:
https://docs.google.com/document/d/1fgbj38CQ0qQB3dc_Y5t8HE-tjaQMWo7O3XB0memhwaU/edit?usp=sharing
> This example DSL query given in documentation doesn't work
> ----------------------------------------------------------
>
> Key: ATLAS-5148
> URL: https://issues.apache.org/jira/browse/ATLAS-5148
> Project: Atlas
> Issue Type: Bug
> Reporter: Rahul Kurup
> Assignee: Aditya Gupta
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)