[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877198#comment-15877198 ]
ASF GitHub Bot commented on DRILL-5258:
---------------------------------------
Github user sohami commented on a diff in the pull request:
https://github.com/apache/drill/pull/752#discussion_r102360734
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockGroupScanPOP.java ---
@@ -75,20 +76,50 @@
*/
private boolean extended;
+ private ScanStats scanStats = ScanStats.TRIVIAL_TABLE;
@JsonCreator
public MockGroupScanPOP(@JsonProperty("url") String url,
- @JsonProperty("extended") Boolean extended,
@JsonProperty("entries") List<MockScanEntry> readEntries) {
super((String) null);
this.readEntries = readEntries;
this.url = url;
- this.extended = extended == null ? false : extended;
+
+ // Compute decent row-count stats for this mock data source so that
+ // the planner is "fooled" into thinking that this operator will do
+ // disk I/O.
+
+ int rowCount = 0;
+ int rowWidth = 0;
+ for (MockScanEntry entry : readEntries) {
+ rowCount += entry.getRecords();
+ int width = 0;
+ if (entry.getTypes() == null) {
+ width = 50;
+ } else {
+ for (MockColumn col : entry.getTypes()) {
+ int colWidth = 0;
+ if (col.getWidthValue() == 0) {
+ colWidth = TypeHelper.getSize(col.getMajorType());
+ } else {
+ colWidth = col.getWidthValue();
+ }
+ colWidth *= col.getRepeatCount();
+ width += colWidth;
+ }
+ }
+ rowWidth = Math.max(rowWidth, width);
--- End diff --
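`rowWidth` actually holds the maximum row width across all entries, and
`width` holds the width of the current entry's row. Can we please rename
them to `maxRowWidth` and `rowWidth`, respectively?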
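
As a rough illustration of the rename suggested above, here is a minimal, self-contained sketch of the same computation with `maxRowWidth` and `rowWidth` as the variable names. The `Entry` and `Column` classes are hypothetical stand-ins for `MockScanEntry` and `MockColumn`, and `defaultWidth` stands in for the value that `TypeHelper.getSize(col.getMajorType())` would return; none of this is the actual Drill code.

```java
import java.util.List;

public class RowWidthSketch {

  // Hypothetical stand-in for MockColumn: only the values the loop reads.
  static final class Column {
    final int width;         // declared width, 0 if not specified
    final int defaultWidth;  // assumed substitute for TypeHelper.getSize(...)
    final int repeatCount;

    Column(int width, int defaultWidth, int repeatCount) {
      this.width = width;
      this.defaultWidth = defaultWidth;
      this.repeatCount = repeatCount;
    }
  }

  // Hypothetical stand-in for MockScanEntry.
  static final class Entry {
    final int records;
    final List<Column> types; // may be null, as in the diff

    Entry(int records, List<Column> types) {
      this.records = records;
      this.types = types;
    }
  }

  // Same fallback as the diff: 50 bytes when no column types are given.
  static final int DEFAULT_ROW_WIDTH = 50;

  // Width of one row for a single entry (called `width` in the diff).
  static int rowWidth(Entry entry) {
    if (entry.types == null) {
      return DEFAULT_ROW_WIDTH;
    }
    int rowWidth = 0;
    for (Column col : entry.types) {
      int colWidth = (col.width == 0) ? col.defaultWidth : col.width;
      rowWidth += colWidth * col.repeatCount;
    }
    return rowWidth;
  }

  // Widest row over all entries (called `rowWidth` in the diff).
  static int maxRowWidth(List<Entry> entries) {
    int maxRowWidth = 0;
    for (Entry entry : entries) {
      maxRowWidth = Math.max(maxRowWidth, rowWidth(entry));
    }
    return maxRowWidth;
  }

  // Total record count over all entries (called `rowCount` in the diff).
  static int totalRowCount(List<Entry> entries) {
    int rowCount = 0;
    for (Entry entry : entries) {
      rowCount += entry.records;
    }
    return rowCount;
  }
}
```

Splitting the per-entry width into its own method is a design choice of this sketch, not something the PR does; it just makes the two meanings of "width" hard to confuse.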
> Allow "extended" mock tables access from SQL queries
> ----------------------------------------------------
>
> Key: DRILL-5258
> URL: https://issues.apache.org/jira/browse/DRILL-5258
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.10
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Fix For: 1.10
>
>
> DRILL-5152 provided a simple way to generate sample data in SQL using a new,
> simplified version of the mock data generator. This approach is very
> convenient, but is inherently limited. For example, the limited syntax
> available in SQL does not encode much information about columns, such as
> repeat count, data generator, and so on. Nor does the simple SQL approach allow
> generating multiple groups of data.
> However, all these features are present in the original mock data source via
> a special JSON configuration file. Previously, only physical plans could
> access that extended syntax.
> This ticket requests a SQL interface to the extended mock data source:
> {code}
> SELECT * FROM `mock`.`example/mock-options.json`
> {code}
> Mock data source options are always stored as a JSON file. Since the existing
> mock data generator for SQL never uses JSON files, a simple rule is that if
> the table name ends in ".json" then it is a specification, else the
> information is encoded in table and column names.
> The format of the data generation syntax is documented in the mock data
> source classes.
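
The ".json" rule in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and treating the check as a plain, case-sensitive suffix test is an assumption rather than a confirmed detail of the Drill implementation.

```java
public class MockTableNameSketch {

  // Hypothetical helper: returns true when the table name should be treated
  // as a JSON specification file for the extended mock data source, and
  // false when the information is encoded in the table and column names.
  // Assumption: a plain, case-sensitive suffix check is sufficient.
  static boolean isExtendedSpec(String tableName) {
    return tableName != null && tableName.endsWith(".json");
  }
}
```

Under this sketch, the table name `example/mock-options.json` from the query above would be treated as a specification file, while a name without the ".json" suffix would fall through to the existing name-based encoding.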