rzo1 commented on code in PR #1766:
URL: https://github.com/apache/stormcrawler/pull/1766#discussion_r2665188681
##########
external/sql/src/main/java/org/apache/stormcrawler/sql/IndexerBolt.java:
##########
@@ -74,51 +80,25 @@ public void execute(Tuple tuple) {
String normalisedurl = valueForURL(tuple);
Metadata metadata = (Metadata) tuple.getValueByField("metadata");
- String text = tuple.getStringByField("text");
+ tuple.getStringByField("text");
Review Comment:
If the `text` local var is not used, the call can be removed.
##########
external/sql/src/main/java/org/apache/stormcrawler/sql/StatusUpdaterBolt.java:
##########
@@ -253,17 +286,27 @@ private synchronized void checkExecuteBatch() throws
SQLException {
lastInsertBatchTime = System.currentTimeMillis();
currentBatchSize = 0;
waitingAck.clear();
-
- insertPreparedStmt.close();
- insertPreparedStmt = connection.prepareStatement(insertQuery);
}
@Override
public void cleanup() {
Review Comment:
There is a slight risk that this method isn't called in a cluster setup:
> The cleanup method is called when a Bolt is being shutdown and should
cleanup any resources that were opened. There's no guarantee that this method
will be called on the cluster: for example, if the machine the task is running
on blows up, there's no way to invoke the method. The cleanup method is
intended for when you run topologies in [local
mode](https://storm.apache.org/releases/2.8.3/Local-mode.html) (where a Storm
cluster is simulated in a process), and you want to be able to run and kill
many topologies without suffering any resource leaks.
Don't think it would be a huge issue for now.
##########
pom.xml:
##########
@@ -673,6 +674,13 @@ under the License.
<version>${commons.codec.version}</version>
</dependency>
+ <dependency>
+ <groupId>com.mysql</groupId>
+ <artifactId>mysql-connector-j</artifactId>
+ <version>${mysql-connector-j}</version>
+ <scope>test</scope>
Review Comment:
I would drop the scope here.
##########
external/sql/pom.xml:
##########
@@ -37,13 +37,29 @@ under the License.
<description>SQL-based resources for StormCrawler</description>
<dependencies>
- <!--
- <dependency>
- <groupId>mysql</groupId>
- <artifactId>mysql-connector-java</artifactId>
- <version>5.1.31</version>
- </dependency>
- -->
+
+ <dependency>
+ <groupId>org.testcontainers</groupId>
+ <artifactId>junit-jupiter</artifactId>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.testcontainers</groupId>
+ <artifactId>mysql</artifactId>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>com.mysql</groupId>
+ <artifactId>mysql-connector-j</artifactId>
+ <scope>test</scope>
Review Comment:
I think we should ship with a jdbc driver, so this should likely be not
in`test` scope.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]