[
https://issues.apache.org/jira/browse/GORA-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161775#comment-14161775
]
Sergey Weiss edited comment on GORA-227 at 10/7/14 11:28 AM:
-------------------------------------------------------------
Hello!
I have debugged TestGenerator and, from what I saw, it fails because the
query is executed on a different MemStore instance than the one that holds
the injected web pages. That is, when GeneratorJob initializes its mapper and
reducer, it creates a new MemStore instance for each. Each of these two
instances holds its own internal map and knows nothing about the other or
about the MemStore created by TestGenerator (and populated with web pages).
What is the best way to address this issue? Should we somehow amend
DataStoreFactory to make it return a single MemStore instance, or should all
MemStores share their state? Any suggestions?
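To illustrate the behaviour described above, here is a minimal sketch. It does not use Gora's actual classes: `IsolatedStore` and `SharedStore` are hypothetical stand-ins, and the name-keyed static registry is only one assumed way the "shared state" option could look, not a proposed patch.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for an in-memory store (NOT Gora's MemStore).
// Every instance owns a private map, so instances cannot see each other's data.
class IsolatedStore {
  private final Map<String, String> map = new TreeMap<>();

  void put(String key, String value) { map.put(key, value); }

  int size() { return map.size(); }
}

// Hypothetical variant: instances created with the same name share one backing
// map through a static registry, so data written via one instance is visible
// to every other instance with that name.
class SharedStore {
  private static final Map<String, Map<String, String>> REGISTRY =
      new ConcurrentHashMap<>();

  private final Map<String, String> map;

  SharedStore(String name) {
    // Reuse the existing map for this name, or create it on first use.
    this.map = REGISTRY.computeIfAbsent(name, n -> new ConcurrentSkipListMap<>());
  }

  void put(String key, String value) { map.put(key, value); }

  int size() { return map.size(); }
}

class MemStoreIsolationDemo {
  public static void main(String[] args) {
    // The "test" injects two pages into its own store instance.
    IsolatedStore testStore = new IsolatedStore();
    testStore.put("com.example:http/a", "page a");
    testStore.put("com.example:http/b", "page b");

    // The "job" creates a fresh instance and therefore sees nothing.
    IsolatedStore jobStore = new IsolatedStore();
    System.out.println("isolated: " + jobStore.size()); // prints "isolated: 0"

    // With shared backing state, the second instance sees the injected pages.
    SharedStore sharedTestStore = new SharedStore("webpage");
    sharedTestStore.put("com.example:http/a", "page a");
    sharedTestStore.put("com.example:http/b", "page b");

    SharedStore sharedJobStore = new SharedStore("webpage");
    System.out.println("shared: " + sharedJobStore.size()); // prints "shared: 2"
  }
}
```

The isolated case mirrors the `expected:<2> but was:<0>` failures: the reader queries an instance that was never populated. Static shared state would make the data visible, but it would also leak between tests unless it is cleared, which is one reason a single factory-managed instance might be preferable.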
> Failing assertions when putting and getting Values using MemStore#execute
> -------------------------------------------------------------------------
>
> Key: GORA-227
> URL: https://issues.apache.org/jira/browse/GORA-227
> Project: Apache Gora
> Issue Type: Sub-task
> Components: gora-core
> Affects Versions: 0.3
> Environment: gora-core 0.3, Nutch 2.x HEAD
> Reporter: Lewis John McGibbney
> Fix For: 0.6
>
>
> Test [0] fails with the following useless logging... I need to DEBUG this
> much more thoroughly
> {code}
> Testcase: testGenerateHighest took 1.845 sec
> FAILED
> expected:<2> but was:<0>
> junit.framework.AssertionFailedError: expected:<2> but was:<0>
> at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)
> Testcase: testGenerateHostLimit took 1.207 sec
> FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)
> Testcase: testGenerateDomainLimit took 1.175 sec
> FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)
> Testcase: testFilter took 2.31 sec
> FAILED
> expected:<3> but was:<0>
> junit.framework.AssertionFailedError: expected:<3> but was:<0>
> at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
> {code}
> However, so far I have found that the tests all share the
> following code:
> {code}
> public static ArrayList<URLWebPage> readContents(DataStore<String,WebPage> store,
>     Mark requiredMark, String... fields) throws Exception {
>   ArrayList<URLWebPage> l = new ArrayList<URLWebPage>();
>   Query<String, WebPage> query = store.newQuery();
>   if (fields != null) {
>     query.setFields(fields);
>   }
>   Result<String, WebPage> results = store.execute(query);
>   while (results.next()) {
>     try {
>       WebPage page = results.get();
>       String url = results.getKey();
>       if (page == null)
>         continue;
>       if (requiredMark != null && requiredMark.checkMark(page) == null)
>         continue;
>       l.add(new URLWebPage(TableUtil.unreverseUrl(url),
>           (WebPage)page.clone()));
>     } catch (Exception e) {
>       e.printStackTrace();
>     }
>   }
>   return l;
> }
> {code}
> and also that the assertions are all of the type
> {code}
> ArrayList<URLWebPage> fetchList =
>     CrawlTestUtil.readContents(webPageStore, Mark.GENERATE_MARK, FIELDS);
> // verify we got right amount of records
> assertEquals(1, fetchList.size());
> {code}
> [0]
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup