[ https://issues.apache.org/jira/browse/CONNECTORS-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright updated CONNECTORS-1364:
------------------------------------
    Fix Version/s: ManifoldCF 2.7

> Better bin naming in the Shared Drive Connector
> -----------------------------------------------
>
>                 Key: CONNECTORS-1364
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1364
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: JCIFS connector
>    Affects Versions: ManifoldCF 1.9
>            Reporter: Aeham Abushwashi
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.7
>
>
> Hello and happy new year!
> Bin naming in the Shared Drive Connector makes assumptions that are not
> always valid.
> As I understand it, Manifold uses bins to prevent overloading data sources.
> In the SDC, server name is designated as bin name. All jobs created against a
> particular server will be treated as one unit when documents are prioritised,
> which can severely disadvantage some jobs (e.g. late starters).
> Moreover, this is incompatible with some common enterprise server topologies.
> In Windows DFS, which is widely used in large enterprises, what the SDC
> thinks of as a server name, isn’t actually a physical resource. It’s a
> namespace that can span many servers and shares. In this case, it doesn’t
> make sense to throttle simply on the root ‘server’ name. In other
> environments, a powerful storage server can be more than capable of handling
> high crawl load; overzealous throttling can end up limiting/hurting
> Manifold’s performance there.
> I’m struggling to find a single solution that fits all so I’m leaning towards
> passing in to the repo connection config some sort of server topology flag or
> throttling depth flag as a hint that ShareDriveConnector#getBinNames can use
> to decide whether the bin name should be server, server+share or
> server+share+root_folder. Share and root_folder would need to be explicitly
> passed in the repo config too or extracted from the documentIdentifier arg in
> getBinNames (assuming it's reliable).
> Thoughts?
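
For illustration only, the throttling-depth idea in the description could be sketched as below. This is not the connector's actual code: the class name BinNameSketch, the BinDepth enum, and the two-argument getBinNames signature are hypothetical, and the sketch assumes document identifiers of the form smb://server/share/folder/..., which the description itself flags as something that may not be reliable.

```java
import java.util.Arrays;

/** Hypothetical sketch of depth-aware bin naming; not ManifoldCF code. */
final class BinNameSketch {

    /** Hypothetical "throttling depth" hint read from the repository connection config. */
    enum BinDepth { SERVER, SERVER_SHARE, SERVER_SHARE_ROOT_FOLDER }

    /**
     * Derives one bin name from a document identifier.
     * Assumption: identifiers look like smb://server/share/folder/file.
     */
    static String[] getBinNames(String documentIdentifier, BinDepth depth) {
        final String scheme = "smb://";
        // Strip the scheme if present so the first path segment is the server name.
        String path = documentIdentifier.startsWith(scheme)
            ? documentIdentifier.substring(scheme.length())
            : documentIdentifier;

        String[] segments = path.split("/");
        // SERVER keeps 1 segment, SERVER_SHARE keeps 2, SERVER_SHARE_ROOT_FOLDER keeps 3.
        int wanted = depth.ordinal() + 1;
        // Identifiers shorter than the requested depth fall back to what is available.
        int used = Math.min(wanted, segments.length);

        String bin = String.join("/", Arrays.copyOfRange(segments, 0, used));
        return new String[] { bin };
    }
}
```

With the identifier smb://fileserver/projects/alpha/report.docx, the three depths would yield the bins fileserver, fileserver/projects and fileserver/projects/alpha. The sketch does not check whether the third segment is a folder or a file sitting directly under the share, so a real implementation would need to handle that case, or take share and root_folder from the repository configuration instead.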
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)