Hi,

#1: It won't support regular expressions. The filtering is done by looking
for a substring.
(To be precise, <url string>.indexOf(<whitelist or blacklist entry>).) URLs
are matched against the blacklist first, and those remaining are then
matched against the whitelist.
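
To illustrate the matching described above, here is a minimal sketch. It is
not the plugin's actual source: the class name SubstringFilterSketch and its
filter method are hypothetical, and I am assuming that a URL matching neither
list is rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of substring-based filtering; not the plugin's code.
class SubstringFilterSketch {
    private final List<String> blacklist;
    private final List<String> whitelist;

    SubstringFilterSketch(List<String> blacklist, List<String> whitelist) {
        this.blacklist = new ArrayList<>(blacklist);
        this.whitelist = new ArrayList<>(whitelist);
    }

    /** Returns the URL if it is accepted, or null if it is rejected. */
    String filter(String url) {
        // Blacklist first: any entry found as a substring rejects the URL.
        for (String entry : blacklist) {
            if (url.indexOf(entry) != -1) {
                return null;
            }
        }
        // Then the whitelist: any entry found as a substring accepts the URL.
        for (String entry : whitelist) {
            if (url.indexOf(entry) != -1) {
                return url;
            }
        }
        // Assumption: a URL that matches neither list is rejected.
        return null;
    }
}
```

With this kind of matching, an entry such as ^https?://(blog|blogs)\. is
compared literally, so it would not match http://blog.example.com/. A plain
entry such as "blog." would.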

#2: The whitelist and blacklist are stored in memory using Java
ArrayLists. The maximum size of an ArrayList depends on the memory
available to the JVM. I have never played around with subcollections, so I
can't comment on the performance degradation or the optimal number of lines.
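
If you want a rough feel for the cost yourself, something like the sketch
below could be a starting point. It only times one linear scan of an
ArrayList with indexOf. The class ListScanTiming, its methods, and the
generated host names are all made up for illustration, and a single
System.nanoTime() measurement is not a proper benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing sketch; entries and URLs are invented for illustration.
class ListScanTiming {
    // Builds n made-up entries such as "http://host0.example.com/".
    static List<String> buildEntries(int n) {
        List<String> entries = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            entries.add("http://host" + i + ".example.com/");
        }
        return entries;
    }

    // Linear scan: true if any entry occurs as a substring of the URL.
    static boolean containsAny(List<String> entries, String url) {
        for (String entry : entries) {
            if (url.indexOf(entry) != -1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Number of entries; pass e.g. 1000000 on the command line to try a million.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 100_000;
        List<String> entries = buildEntries(n);

        // A URL that matches nothing forces a scan of the whole list.
        String miss = "http://unlisted.example.org/page";
        long start = System.nanoTime();
        boolean hit = containsAny(entries, miss);
        long micros = (System.nanoTime() - start) / 1_000;

        System.out.println(n + " entries, hit=" + hit + ", " + micros + " us");
    }
}
```

The point is only that every URL which matches nothing is compared against
every entry, so the work per URL grows with the number of lines.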

Thanks,
Tejas Patil


On Sat, Jan 26, 2013 at 2:23 PM, Jason S <[email protected]> wrote:

> Hello,
>
> I have a couple questions about the subcollection plugin.
>
> 1) Is it possible to use a regular expression?  Can I use a string like
> this:
>
> ^https?://(blog|blogs)\.
>
> 2) If I am using individual urls in the whitelist / blacklist filters, how
> many are possible?  Could I have a subcollection configuration with 100K
> lines?  A million lines?  At what point would the plugin break, and / or
> start to degrade in performance?
>
> Thanks in advance,
>
> ~Jason
>
