Hi,
When I tried to change the HBase table name from the default "webpage" to
another name, "nutch_test_table" for example, I found an inconsistency
between the configuration value and the code.
In nutch-default.xml there is:
<property>
  <name>storage.schema</name>
  <value>webpage</value>
  <description>This value holds the schema name used for Nutch web db.
  Note that Nutch ignores the value in the gora mapping files, and uses
  this as the schema name.
  </description>
</property>
But the code reads a different key:
src/java/org/apache/nutch/storage/StorageUtils.java:
schema = conf.get("storage.schema.webpage", "webpage");
Whenever I change the name in the config, it still uses "webpage" as the
schema, and I get the following warning:
"
store.HBaseStore - Keyclass and nameclass match but mismatching table names
mappingfile schema is 'nutch_test_table' vs actual schema 'webpage' ,
assuming they are the same.
"
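To illustrate why the custom name is ignored, here is a minimal sketch of the lookup. It assumes java.util.Properties as a stand-in for Hadoop's Configuration (both return the supplied default when the key is not set); the class and method names are hypothetical and are not part of Nutch.

```java
import java.util.Properties;

public class SchemaKeyMismatch {

    // Mimics the lookup pattern in StorageUtils: read a key and fall back
    // to "webpage" when that key is not set. (Hypothetical helper.)
    static String resolveSchema(Properties conf, String key) {
        return conf.getProperty(key, "webpage");
    }

    // Stand-in configuration in which only "storage.schema" is set,
    // as it would be after overriding the documented property.
    static Properties sampleConf() {
        Properties conf = new Properties();
        conf.setProperty("storage.schema", "nutch_test_table");
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = sampleConf();
        // Key read by the current code: not set, so the default is returned.
        System.out.println(resolveSchema(conf, "storage.schema.webpage"));
        // Key documented in nutch-default: set, so the custom name is returned.
        System.out.println(resolveSchema(conf, "storage.schema"));
    }
}
```

The first lookup prints "webpage" and the second prints "nutch_test_table", which matches the behaviour described above.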
When I change the code to:
Index: src/java/org/apache/nutch/storage/StorageUtils.java
===================================================================
--- src/java/org/apache/nutch/storage/StorageUtils.java (revision 1357962)
+++ src/java/org/apache/nutch/storage/StorageUtils.java (working copy)
@@ -51,7 +51,7 @@
     String schema = null;
     if (WebPage.class.equals(persistentClass)) {
-      schema = conf.get("storage.schema.webpage", "webpage");
+      schema = conf.get("storage.schema", "webpage");
     } else if (Host.class.equals(persistentClass)) {
       schema = conf.get("storage.schema.host", "host");
     } else {
It works well. I think we should change either the code or the default
config entry. I am not sure whether this is a known problem or whether
using other table names is simply not recommended.
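If changing the key outright is a concern for existing setups, one possible variant is to accept both keys. The following is only a sketch under assumptions: it uses java.util.Properties in place of Hadoop's Configuration, and the class and method names are hypothetical. It is not what the patch above does.

```java
import java.util.Properties;

public class SchemaLookup {

    // Hypothetical helper: prefer "storage.schema.webpage" if it is set,
    // otherwise use the documented "storage.schema", otherwise "webpage".
    static String webpageSchema(Properties conf) {
        String general = conf.getProperty("storage.schema", "webpage");
        return conf.getProperty("storage.schema.webpage", general);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing set: falls through to the built-in default.
        System.out.println(webpageSchema(conf));

        // Only the documented key set: the custom name is used.
        conf.setProperty("storage.schema", "nutch_test_table");
        System.out.println(webpageSchema(conf));
    }
}
```

With this order, a config that sets only storage.schema gets the custom table name, and a config that already sets storage.schema.webpage keeps its current behaviour.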
Tianwei