David Knupp has posted comments on this change. (
http://gerrit.cloudera.org:8080/15642 )
Change subject: IMPALA-9362: Upgrade sqlparse 0.1.19 -> 0.3.1
......................................................................
Patch Set 6:
> Patch Set 6:
>
> I think we can probably live with a 10-15% regression for extreme cases like
> that, it would be a bit heroic to try to fix it in sqlparse.
I was more wondering if there were things that we could do in our code. It does
seem like sqlparse itself is slower, but just doing something like this before
stripping leading comments:
# Only call sqlparse if indicated
if sql.lstrip() and sql.lstrip()[0] in ('-', '/'):
stack = sqlparse.engine.FilterStack()
strip_leading_comment_filter = StripLeadingCommentFilter()
stack.stmtprocess.append(strip_leading_comment_filter)
stack.postprocess.append(sqlparse.filters.SerializerUnicode())
stripped_line = ''.join(stack.run(sql, 'utf-8'))
return strip_leading_comment_filter.comment, stripped_line
else:
return None, sql.decode('utf-8')
...all the tests still pass, and the time improves:
# sqlparse-0.3.1
Time to parse large sql: 0.771239995956
Time to parse large sql: 0.756343841553
Time to parse large sql: 0.755437135696
Time to parse large sql: 0.760624885559
Time to parse large sql: 0.764199018478
Time to parse large sql: 0.753574132919
Time to parse large sql: 0.75120306015
Time to parse large sql: 0.753540039062
Time to parse large sql: 0.764436006546
I'm happy to do that if you think it's worth it, or I'm happy ship as is if
you're fine with the current patch.
--
To view, visit http://gerrit.cloudera.org:8080/15642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77a1fd5ae311634a18ee04b8c389d8a3f3a6e001
Gerrit-Change-Number: 15642
Gerrit-PatchSet: 6
Gerrit-Owner: David Knupp <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Wed, 08 Apr 2020 22:13:24 +0000
Gerrit-HasComments: No