[
https://issues.apache.org/jira/browse/SPARK-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-20248.
----------------------------------
Resolution: Incomplete
> Spark SQL add limit parameter to enhance the reliability.
> ---------------------------------------------------------
>
> Key: SPARK-20248
> URL: https://issues.apache.org/jira/browse/SPARK-20248
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Environment: 2.1.0
> Reporter: shaolinliu
> Priority: Minor
> Labels: bulk-closed
>
> When using the Thrift server, it is difficult to constrain users' SQL
> statements.
> When a user queries a large table without a limit, the Thrift server
> process's memory usage grows, which makes the service unstable.
> In general, such a query is a misuse, because if you really need to
> return the whole table:
> 1. if you use the data for a computation, you can complete the computation
> in the cluster and then return the result;
> 2. if you want to obtain the data itself, you can store it in HDFS.
> For these scenarios, it is recommended to add a
> "spark.sql.thriftserver.retainedResults" parameter:
> 1. when it is 0, we do not restrict the user's operation;
> 2. when it is greater than 0, if the user's query has a limit, we use the
> user's limit; if not, we use this value to limit the query's result.
> The user's limit takes priority because a user who specifies a limit is,
> in general, aware of the exact meaning of the query.
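
The rule proposed in the description can be sketched as a small decision function. This is only an illustration of the semantics described in the issue, not Spark code: "spark.sql.thriftserver.retainedResults" is the parameter name proposed by the reporter (the issue was resolved as Incomplete), and the function name `effective_limit` and its arguments are hypothetical.

```python
from typing import Optional


def effective_limit(user_limit: Optional[int], retained_results: int) -> Optional[int]:
    """Return the row limit to apply to a query, or None for no limit.

    user_limit: the LIMIT the user wrote in the query, or None if absent.
    retained_results: value of the proposed (hypothetical) setting
        "spark.sql.thriftserver.retainedResults".
    """
    if retained_results < 0:
        raise ValueError("retained_results must be >= 0")
    # A limit written by the user always takes priority, as proposed in the issue.
    if user_limit is not None:
        return user_limit
    # 0 means the server does not restrict the query.
    if retained_results == 0:
        return None
    # Otherwise the server-side value caps the result.
    return retained_results
```

Note that under this proposal a user-supplied limit is used as-is, even when it is larger than the server-side value.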