tangdian commented on a change in pull request #3830: [TE] SQL Connector
backend and front end, supporting Presto, MySQL, H2, with sample data in H2
URL: https://github.com/apache/incubator-pinot/pull/3830#discussion_r272286532
##########
File path:
thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/datasource/pinot/resultset/ThirdEyeDataFrameResultSet.java
##########
@@ -87,6 +91,87 @@ public String getGroupKeyColumnValue(int rowIdx, int columnIdx) {
return dataFrame.get(thirdEyeResultSetMetaData.getGroupKeyColumnNames().get(columnIdx)).getString(rowIdx);
}
+ /**
+ * Constructs a {@link ThirdEyeDataFrameResultSet} from any SQL's {@link java.sql.ResultSet}.
+ *
+ * @param resultSet resultset from SQL query
+ * @param metric the metric the SQL is querying
+ * @param groupByKeys all groupbykeys from query
+ * @param aggGranularity aggregation granularity of the query
+ * @param timeSpec timeSpec of the query
+ * @return a unified {@link ThirdEyeDataFrameResultSet}
+ */
+ public static ThirdEyeDataFrameResultSet fromSQLResultSet(java.sql.ResultSet resultSet, String metric,
+ List<String> groupByKeys, TimeGranularity aggGranularity, TimeSpec timeSpec) throws Exception {
+
+ List<String> groupKeyColumnNames = new ArrayList<>();
+ if (aggGranularity != null && !groupByKeys.contains(timeSpec.getColumnName())) {
+ groupKeyColumnNames.add(0, DataFrameUtils.COL_TIME);
+ }
+
+ for (String groupByKey: groupByKeys) {
+ groupKeyColumnNames.add(groupByKey);
+ }
+
+ List<String> metrics = new ArrayList<>();
+ metrics.add(metric);
+ ThirdEyeResultSetMetaData thirdEyeResultSetMetaData =
+ new ThirdEyeResultSetMetaData(groupKeyColumnNames, metrics);
+ // Build the DataFrame
+ List<String> columnNameWithDataType = new ArrayList<>();
+ // Always cast dimension values to STRING type
+
+ for (String groupColumnName : thirdEyeResultSetMetaData.getGroupKeyColumnNames()) {
+ columnNameWithDataType.add(groupColumnName + ":STRING");
+ }
+
+ columnNameWithDataType.addAll(thirdEyeResultSetMetaData.getMetricColumnNames());
+ DataFrame.Builder dfBuilder = DataFrame.builder(columnNameWithDataType);
+
+ try {
+ int metricColumnCount = metrics.size();
+ int groupByColumnCount = groupKeyColumnNames.size();
+ int totalColumnCount = groupByColumnCount + metricColumnCount;
+
+ outer: while (resultSet.next()) {
+ String[] columnsOfTheRow = new String[totalColumnCount];
+ // GroupBy column value(i.e., dimension values)
+ for (int groupByColumnIdx = 1; groupByColumnIdx <= groupByColumnCount; groupByColumnIdx++) {
+ String valueString = null;
+ try {
+ valueString = resultSet.getString(groupByColumnIdx);
+ } catch (Exception e) {
+ // Do nothing and subsequently insert a null value to the current series.
+ }
+ columnsOfTheRow[groupByColumnIdx - 1] = valueString;
+ }
+ // Metric column's value
+ for (int metricColumnIdx = 1; metricColumnIdx <= metricColumnCount; metricColumnIdx++) {
+ String valueString = null;
+ try {
+ valueString = resultSet.getString(groupByColumnCount + metricColumnIdx);
+ if (valueString == null) {
+ break outer;
Review comment:
In this case the labeled break is clearer to read and easier to understand
than the two alternatives I can think of: using a flag variable or extracting
the inner loop into a separate method. Do you have any better suggestions?
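
For comparison, here is a minimal sketch of the three options side by side. It is a simplified stand-in, not the PR's code: rows are modeled as a List<String[]> instead of a java.sql.ResultSet, and the class and method names (LabeledBreakSketch, withLabeledBreak, withFlag, withHelper, hasNullMetric) are hypothetical. In each variant, reading stops at the first row that contains a null metric value, and that row is not kept.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; rows stand in for a java.sql.ResultSet.
class LabeledBreakSketch {

  // Option 1: labeled break, as in the diff above.
  static List<String[]> withLabeledBreak(List<String[]> rows, int groupByColumnCount) {
    List<String[]> kept = new ArrayList<>();
    outer:
    for (String[] row : rows) {
      for (int i = groupByColumnCount; i < row.length; i++) {
        if (row[i] == null) {
          break outer; // leaves both loops at once
        }
      }
      kept.add(row);
    }
    return kept;
  }

  // Option 2: flag variable checked after the inner loop.
  static List<String[]> withFlag(List<String[]> rows, int groupByColumnCount) {
    List<String[]> kept = new ArrayList<>();
    for (String[] row : rows) {
      boolean nullMetricSeen = false;
      for (int i = groupByColumnCount; i < row.length; i++) {
        if (row[i] == null) {
          nullMetricSeen = true;
          break;
        }
      }
      if (nullMetricSeen) {
        break;
      }
      kept.add(row);
    }
    return kept;
  }

  // Option 3: inner loop extracted into a helper method.
  static boolean hasNullMetric(String[] row, int groupByColumnCount) {
    for (int i = groupByColumnCount; i < row.length; i++) {
      if (row[i] == null) {
        return true;
      }
    }
    return false;
  }

  static List<String[]> withHelper(List<String[]> rows, int groupByColumnCount) {
    List<String[]> kept = new ArrayList<>();
    for (String[] row : rows) {
      if (hasNullMetric(row, groupByColumnCount)) {
        break;
      }
      kept.add(row);
    }
    return kept;
  }
}
```

All three variants keep the same rows for the same input. The flag version needs an extra variable and a second check, while the helper version moves the null test out of sight of the loop that reacts to it.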