[ https://issues.apache.org/jira/browse/MAHOUT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Drew Farris updated MAHOUT-395:
-------------------------------
Status: Resolved (was: Patch Available)
Assignee: Drew Farris
Fix Version/s: 0.4
Resolution: Fixed
Applied in r944550, with minor revisions.
> Using KMeansDriver leaves open files and can lead to FileNotFoundException -
> "too many open files" error
> --------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-395
> URL: https://issues.apache.org/jira/browse/MAHOUT-395
> Project: Mahout
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 0.1, 0.2, 0.3, 0.4
> Reporter: Scott Ganyo
> Assignee: Drew Farris
> Priority: Critical
> Fix For: 0.4
>
> Attachments: KMeansDriver.patch
>
>
> KMeansDriver uses the isConverged() method to determine whether the k-means
> clustering run is complete. isConverged() has to open each SequenceFile and
> read each cluster it contains to see whether that cluster has converged.
> During this process the readers are not explicitly closed, so when a large
> number of sequence files are opened, the driving system may run out of file
> handles before they are eventually implicitly reclaimed. I'm attaching a
> patch that explicitly closes these files once they no longer need to
> remain open.
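
The fix described above amounts to closing each reader as soon as its file has been scanned. The following is a minimal sketch of that pattern, not the attached patch. It assumes a hypothetical plain-text format (one "true"/"false" line per cluster) in place of Hadoop SequenceFiles, and the class name ConvergenceCheck is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for the convergence check described in the issue.
// Assumption: each part file holds one line per cluster, "true" if that
// cluster has converged and "false" otherwise. Mahout itself reads Hadoop
// SequenceFiles; plain text is used here only to keep the sketch stdlib-only.
final class ConvergenceCheck {

    private ConvergenceCheck() {}

    // Returns true only if every cluster in every part file has converged.
    static boolean isConverged(List<Path> partFiles) throws IOException {
        for (Path part : partFiles) {
            // try-with-resources closes the reader when the block exits,
            // whether normally, by early return, or by exception, so at
            // most one file handle is held open at any time.
            try (BufferedReader reader =
                     Files.newBufferedReader(part, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!Boolean.parseBoolean(line.trim())) {
                        return false; // the reader is closed before returning
                    }
                }
            }
        }
        return true;
    }
}
```

On a Java version without try-with-resources, the same effect can be had with a try/finally block that calls close() on the reader in the finally clause. Either way, the number of open handles stays constant no matter how many part files are scanned.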