Yes, a hot function deserves a good optimization. It would be great if you 
could create an issue in JIRA and change your code to follow the common Sqoop 
(Java) code conventions, e.g. always wrap a block in {}, even when it is a 
single line.
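
To illustrate the brace convention on a regex-free identifier cleanup, here is 
a rough sketch only. It is not the code from the pull request: the class name 
AvroIdentifierSketch, the helper isWordChar, and the exact rules (each run of 
characters outside [A-Za-z0-9_] becomes one underscore, and an "AVRO_" prefix 
is added when the result is empty or starts with a digit) are assumptions made 
for the example.

```java
// Hypothetical sketch, not the actual Sqoop or pull-request implementation.
final class AvroIdentifierSketch {

    private AvroIdentifierSketch() {
    }

    // Assumed definition of a "word" character: ASCII letter, digit or underscore.
    private static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '_';
    }

    // Single pass over the input with a StringBuilder instead of regex matching.
    static String toAvroIdentifier(String candidate) {
        StringBuilder sb = new StringBuilder(candidate.length() + 5);
        boolean inRun = false; // true while inside a run of non-word characters
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (isWordChar(c)) {
                sb.append(c);
                inRun = false;
            } else if (!inRun) {
                // Emit one underscore for the whole run of non-word characters.
                sb.append('_');
                inRun = true;
            }
        }
        // Assumed rule: prefix when the result is empty or starts with a digit.
        if (sb.length() == 0 || (sb.charAt(0) >= '0' && sb.charAt(0) <= '9')) {
            sb.insert(0, "AVRO_");
        }
        return sb.toString();
    }
}
```

Note that every if/else body is wrapped in {} even when it holds a single 
statement. Under the assumed rules, toAvroIdentifier("my column") returns 
"my_column" and toAvroIdentifier("1st") returns "AVRO_1st".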

Stanley

-----Original Message-----
From: JoeriHermans [mailto:[email protected]] 
Sent: Monday, April 11, 2016 9:50 PM
To: [email protected]
Subject: [GitHub] sqoop pull request: Optimize toAvroIdentifier

GitHub user JoeriHermans opened a pull request:

    https://github.com/apache/sqoop/pull/18

    Optimize toAvroIdentifier

    Our distributed profiler indicated some inefficiencies in the 
AvroUtil.toAvroIdentifier method, more specifically, the use of Regex patterns. 
This can be directly observed from the FlameGraph generated by this profiler 
(https://jhermans.web.cern.ch/jhermans/sqoop_avro_flamegraph.svg). We 
implemented an optimization, and compared this with the original method. On our 
testing machine, the optimization by itself is 230% (on average) more efficient 
compared to the original implementation. We have yet to test how this 
optimization will influence the performance of user jobs.
    
    Any suggestions or remarks are welcome.
    
    
    Kind regards,
    
    Joeri

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoeriHermans/sqoop patch-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/sqoop/pull/18.patch

To close this pull request, make a commit to your master/trunk branch with (at 
least) the following in the commit message:

    This closes #18
    
----
commit e8a3eaf872fe9804375c736a6e2603015e3a36a2
Author: Joeri Hermans <[email protected]>
Date:   2016-04-11T13:46:07Z

    Optimize toAvroIdentifier
    
    Our distributed profiler indicated some inefficiencies in the 
AvroUtil.toAvroIdentifier method, more specifically, the use of Regex patterns. 
This can be directly observed from the FlameGraph generated by this profiler 
(https://jhermans.web.cern.ch/jhermans/sqoop_avro_flamegraph.svg). We 
implemented an optimization, and compared this with the original method. On our 
testing machine, the optimization by itself is 230% (on average) more efficient 
compared to the original implementation. We have yet to test how this 
optimization will influence the performance of user jobs.

----


