[ https://issues.apache.org/jira/browse/PHOENIX-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353185#comment-14353185 ]
Gabriel Reid commented on PHOENIX-1711:
---------------------------------------

{quote}
I'm curious about one thing, though - do you think there's any overhead in calling CallRunner.run() per row versus once per batch?
{quote}

I would assume that it's lightweight enough that it won't make a measurable difference, but that's just my guess.

Actually, I was curious about why the call to CallRunner.run was needed at all in the map method of CSVUpsertExecutor -- at least when it's being used within MapReduce, it should be fine to just set the context classloader once (which is what CallRunner does, from what I see). From what I recall, the reason for having the CallRunner system is to "fix" the context classloader when running inside JDBC tooling that dynamically loads Phoenix, but I wouldn't think there would ever be a problem when running via the bulk loader.

> Improve performance of CSV loader
> ---------------------------------
>
>                 Key: PHOENIX-1711
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1711
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: PHOENIX-1711.patch
>
>
> Here is a break-up of percentage execution time for some of the steps in the mapper:
> csvParser: 18%
> csvUpsertExecutor.execute(ImmutableList.of(csvRecord)): 39%
> PhoenixRuntime.getUncommittedDataIterator(conn, true): 9%
> while (uncommittedDataIterator.hasNext()): 15%
> Read IO & custom processing: 19%
> See details here: http://s.apache.org/6rl
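
For illustration, a minimal sketch of the "set the context classloader once" idea mentioned in the comment above, assuming the only effect of CallRunner.run() that matters in the bulk-load path is installing Phoenix's classloader as the thread context classloader. This is not the actual Phoenix mapper code; the class and helper names below are hypothetical.

{code:java}
// Hypothetical sketch only -- not the Phoenix bulk loader's real mapper.
// Shows the context classloader being fixed once per map task in setup(),
// rather than wrapping every row in CallRunner.run().
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CsvToKeyValueMapperSketch
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    @Override
    protected void setup(Context context) {
        // Done once per task: the classloader fix-up that CallRunner.run()
        // would otherwise perform on every row (assumption, per the comment).
        Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Per-row work runs directly, with no per-call classloader bookkeeping.
        processRow(value.toString());
    }

    // Hypothetical placeholder for parsing the CSV line and executing the upsert.
    private void processRow(String csvLine) {
    }
}
{code}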