Hi Tejas,
I am following this example: https://github.com/veggen/nutch-element-selector.
I have now tried the example, without any changes, on a fresh checkout
of the Nutch 2.2 source.
Attached is my patch (change set) against the fresh Nutch 2.2 source.
Kindly review it and let me know if I am missing something.
Thanks,
Tony
On Thu, Jun 13, 2013 at 11:19 AM, Tejas Patil <[email protected]> wrote:
> Weird. I would like to have a quick peek at your changes. Maybe you are
> doing something wrong that is hard to predict and figure out by asking
> you a bunch of questions over email. Can you attach a patch file of your
> changes? Please remove the fluff from it and keep only the bare essentials
> in the patch. Also, if you are working for a company, make sure
> that attaching code here is not against your organisational
> policy.
>
> Thanks,
> Tejas
>
> On Wed, Jun 12, 2013 at 11:03 PM, Tony Mullins <[email protected]> wrote:
>
> > I have done all of this. I created my plugin's ivy.xml, plugin.xml and
> > build.xml, and added the entries in nutch-site.xml and src>plugin>build.xml.
> > But I am still getting "PluginRuntimeException:
> > java.lang.ClassNotFoundException"
> >
> >
> > Is there any other configuration that I am missing, or is this a Nutch 2.2
> > issue?
> >
> > Thanks,
> > Tony.
> >
> >
> > On Thu, Jun 13, 2013 at 1:09 AM, Tejas Patil <[email protected]> wrote:
> >
> > > Here is the relevant wiki page:
> > > http://wiki.apache.org/nutch/WritingPluginExample
> > >
> > > Although it's old, I think that it will help.
> > >
> > >
> > > On Wed, Jun 12, 2013 at 1:01 PM, Sebastian Nagel <[email protected]> wrote:
> > >
> > > > Hi Tony,
> > > >
> > > > you have to "register" your plugin in
> > > > src/plugin/build.xml
> > > >
> > > > Does your
> > > > src/plugin/myplugin/plugin.xml
> > > > properly propagate jar file,
> > > > extension point and implementing class?
> > > >
> > > > And, finally, you have to add your plugin
> > > > to the property plugin.includes in nutch-site.xml
> > > >
> > > > Cheers,
> > > > Sebastian
> > > >
> > > > On 06/12/2013 07:48 PM, Tony Mullins wrote:
> > > > > Hi,
> > > > >
> > > > > I am trying a simple ParseFilter plugin in Nutch 2.2. I can build it,
> > > > > and also src>plugin>build.xml, successfully. But its .jar file is not
> > > > > being created in my runtime>local>plugins>myplugin directory.
> > > > >
> > > > > And on running
> > > > > "bin/nutch parsechecker http://www.google.nl"
> > > > > I get this error " java.lang.RuntimeException:
> > > > > org.apache.nutch.plugin.PluginRuntimeException:
> > > > > java.lang.ClassNotFoundException:
> > > > > com.xyz.nutch.selector.HtmlElementSelectorFilter"
> > > > >
> > > > > If I go to MyNutch2.2Source/build/myplugin, I can see the plugin's jar,
> > > > > with the test and classes directories created there. If I copy the .jar
> > > > > from there into my runtime>local>plugins>myplugin directory, along with
> > > > > the plugin.xml file, I still get the same class-not-found exception.
> > > > >
> > > > > I have not made any changes to src>plugin>build-plugin.xml.
> > > > >
> > > > > Could you please tell me what I am doing wrong here?
> > > > >
> > > > > Thanks,
> > > > > Tony
> > > > >
> > > >
> > > >
> > >
> >
>
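
One check that may help narrow down a ClassNotFoundException like the one quoted above is to compare the class name declared in plugin.xml with the package implied by the source path. In the patch below, plugin.xml names kaqqao.nutch.selector.HtmlElementSelectorFilter, while the Java file sits under kaqqao/nutch/plugin/selector and declares package kaqqao.nutch.plugin.selector. The following is only a minimal sketch of that comparison; PluginClassCheck and its methods are hypothetical helpers, not part of Nutch, and it assumes paths are given relative to the plugin's src/java directory.

```java
// Hypothetical helper (not part of Nutch): compares the class name declared
// in plugin.xml with the class name implied by a source file's location.
final class PluginClassCheck {

    // Turns a source path relative to src/java, e.g. "a/b/C.java",
    // into the fully qualified class name it implies, e.g. "a.b.C".
    static String classNameFor(String relativeSourcePath) {
        String suffix = ".java";
        String noExt = relativeSourcePath.endsWith(suffix)
                ? relativeSourcePath.substring(0, relativeSourcePath.length() - suffix.length())
                : relativeSourcePath;
        return noExt.replace('/', '.');
    }

    // True when the class attribute from plugin.xml matches the source layout.
    static boolean declaredMatchesSource(String declaredClass, String relativeSourcePath) {
        return declaredClass.equals(classNameFor(relativeSourcePath));
    }

    public static void main(String[] args) {
        // Values copied from the attached patch.
        String declared = "kaqqao.nutch.selector.HtmlElementSelectorFilter";
        String source = "kaqqao/nutch/plugin/selector/HtmlElementSelectorFilter.java";

        System.out.println("declared in plugin.xml: " + declared);
        System.out.println("implied by source path: " + classNameFor(source));
        System.out.println("match: " + declaredMatchesSource(declared, source));
    }
}
```

If the two names differ, the plugin loader would be asked for a class that the jar does not contain, which is consistent with the exception reported, though other causes (such as the jar not being deployed) are not ruled out by this check.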
Index: conf/gora.properties
===================================================================
--- conf/gora.properties (revision 1492208)
+++ conf/gora.properties (working copy)
@@ -20,10 +20,10 @@
# Default SqlStore properties #
###############################
-gora.sqlstore.jdbc.driver=org.hsqldb.jdbc.JDBCDriver
-gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://localhost/nutchtest
-gora.sqlstore.jdbc.user=sa
-gora.sqlstore.jdbc.password=
+# gora.sqlstore.jdbc.driver=org.hsqldb.jdbc.JDBCDriver
+# gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://localhost/nutchtest
+# gora.sqlstore.jdbc.user=sa
+# gora.sqlstore.jdbc.password=
################################
# Default AvroStore properties #
@@ -60,7 +60,8 @@
# CassandraStore properties #
#############################
-# gora.cassandrastore.servers=localhost:9160
+ gora.cassandrastore.servers=localhost:9160
+ gora.datastore.default=org.apache.gora.cassandra.store.CassandraStore
#######################
# MemStore properties #
Index: conf/nutch-default.xml
===================================================================
--- conf/nutch-default.xml (revision 1492208)
+++ conf/nutch-default.xml (working copy)
@@ -60,7 +60,7 @@
<property>
<name>http.agent.name</name>
- <value></value>
+ <value>MyIYCrawler</value>
<description>HTTP 'User-Agent' request header. MUST NOT be empty -
please set this to a single word uniquely related to your organization.
@@ -79,7 +79,7 @@
<property>
<name>http.robots.agents</name>
- <value>*</value>
+ <value>MyIYCrawler</value>
<description>The agent strings we'll look for in robots.txt files,
comma-separated, in decreasing order of precedence. You should
put the value of http.agent.name as the first agent name, and keep the
@@ -823,7 +823,7 @@
<property>
<name>plugin.folders</name>
- <value>plugins</value>
+ <value>/root/workspace_eclipse_new/Nutch2.2/src/plugin</value>
<description>Directories where nutch plugins are located. Each
element may be a relative or absolute path. If absolute, it is used
as is. If relative, it is searched for on the classpath.</description>
Index: conf/nutch-site.xml.template
===================================================================
--- conf/nutch-site.xml.template (revision 1492208)
+++ conf/nutch-site.xml.template (working copy)
@@ -4,5 +4,77 @@
<!-- Put site-specific property overrides in this file. -->
<configuration>
+<property>
+ <name>storage.data.store.class</name>
+ <value>org.apache.gora.cassandra.store.CassandraStore</value>
+ <description>Default class for storing data</description>
+</property>
+
+<property>
+<name>http.agent.name</name>
+<value>MyIYCrawler</value>
+<description>HTTP 'User-Agent' request header. MUST NOT be empty -
+please set this to a single word uniquely related to your organization.
+</description>
+</property>
+
+<property>
+<name>http.robots.agents</name>
+<value>MyIYCrawler</value>
+<description>The agent strings we'll look for in robots.txt files,
+comma-separated, in decreasing order of precedence. You should
+put the value of http.agent.name as the first agent name, and keep the
+default * at the end of the list. E.g.: BlurflDev,Blurfl,*
+</description>
+</property>
+
+<property>
+ <name>plugin.folders</name>
+ <value>/root/workspace_eclipse_new/Nutch2.2/src/plugin</value>
+ <description>Directories where nutch plugins are located. Each
+ element may be a relative or absolute path. If absolute, it is used
+ as is. If relative, it is searched for on the classpath.</description>
+</property>
+
+<property>
+ <name>parser.html.selector.blacklist</name>
+ <value>footer,div#mngb</value>
+ <description>
+ A comma-delimited list of css like tags to identify the elements which should
+ NOT be parsed. Use this to tell the HTML parser to ignore the given elements, e.g. site navigation.
+ It is allowed to only specify the element type (required), and optional its class name ('.')
+ or ID ('#'). More complex expressions will not be parsed.
+ Valid examples: div.header,span,p#test,div#main,ul,div.footercol
+ Invalid expressions: div#head#part1,#footer,.inner#post
+ Note that the elements and their children will be silently ignored by the parser,
+ so verify the indexed content with Luke to confirm results.
+ Use either 'parser.html.selector.blacklist' or 'parser.html.selector.whitelist', but not both of them at once. If so,
+ only the whitelist is used.
+ </description>
+</property>
+<property>
+ <name>parser.html.selector.protected_urls</name>
+ <value>http://www.example.com/home</value>
+ <description>Comma separated list of URLs for pages that should be excluded from element filtering</description>
+</property>
+<property>
+ <name>parser.html.selector.storage_field</name>
+ <value>filtered_content</value>
+ <description>The name of the document field where the filtered content should be stored</description>
+</property>
+
+<property>
+ <name>plugin.includes</name>
+ <value>protocol-http|urlfilter-regex|parse-(html|tika)|element-selector|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
+ <description>
+ Regular expression naming plugin directory names to
+ include. Any plugin not matching this expression is excluded.
+ In any case you need at least include the nutch-extensionpoints plugin. By
+ default Nutch includes crawling just HTML and plain text via HTTP,
+ and basic indexing and search plugins. In order to use HTTPS please enable
+ protocol-httpclient, but be aware of possible intermittent problems with the
+ underlying commons-httpclient library.
+ </description>
+</property>
</configuration>
Index: conf/regex-urlfilter.txt.template
===================================================================
--- conf/regex-urlfilter.txt.template (revision 1492208)
+++ conf/regex-urlfilter.txt.template (working copy)
@@ -36,4 +36,4 @@
-.*(/[^/]+)/[^/]+\1/[^/]+\1/
# accept anything else
-+.
++^http://([a-z0-9]*\.)*lucene.apache.org/
Index: ivy/ivy.xml
===================================================================
--- ivy/ivy.xml (revision 1492208)
+++ ivy/ivy.xml (working copy)
@@ -119,9 +119,9 @@
<dependency org="org.apache.gora" name="gora-accumulo" rev="0.3" conf="*->default" />
-->
<!-- Uncomment this to use Cassandra as Gora backend. -->
- <!--
+
<dependency org="org.apache.gora" name="gora-cassandra" rev="0.3" conf="*->default" />
- -->
+
<!--global exclusion -->
<exclude module="ant" />
Index: src/plugin/build.xml
===================================================================
--- src/plugin/build.xml (revision 1492208)
+++ src/plugin/build.xml (working copy)
@@ -58,6 +58,7 @@
<ant dir="urlnormalizer-basic" target="deploy"/>
<ant dir="urlnormalizer-pass" target="deploy"/>
<ant dir="urlnormalizer-regex" target="deploy"/>
+ <ant dir="element-selector" target="deploy" />
<!--
<ant dir="feed" target="deploy"/>
<ant dir="parse-ext" target="deploy"/>
Index: src/plugin/element-selector/build.xml
===================================================================
--- src/plugin/element-selector/build.xml (revision 0)
+++ src/plugin/element-selector/build.xml (revision 0)
@@ -0,0 +1,22 @@
+<?xml version="1.0"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<project name="element-selector" default="jar-core">
+
+ <import file="../build-plugin.xml"/>
+
+</project>
Index: src/plugin/element-selector/ivy.xml
===================================================================
--- src/plugin/element-selector/ivy.xml (revision 0)
+++ src/plugin/element-selector/ivy.xml (revision 0)
@@ -0,0 +1,41 @@
+<?xml version="1.0" ?>
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<ivy-module version="1.0">
+ <info organisation="org.apache.nutch" module="${ant.project.name}">
+ <license name="Apache 2.0"/>
+ <ivyauthor name="Apache Nutch Team" url="http://nutch.apache.org"/>
+ <description>
+ Apache Nutch
+ </description>
+ </info>
+
+ <configurations>
+ <include file="../../../ivy/ivy-configurations.xml"/>
+ </configurations>
+
+ <publications>
+ <!--get the artifact from our module name-->
+ <artifact conf="master"/>
+ </publications>
+
+ <dependencies>
+ </dependencies>
+
+</ivy-module>
Index: src/plugin/element-selector/plugin.xml
===================================================================
--- src/plugin/element-selector/plugin.xml (revision 0)
+++ src/plugin/element-selector/plugin.xml (revision 0)
@@ -0,0 +1,29 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<plugin
+ id="element-selector"
+ name="Blacklist and Whitelist Parser and Indexer"
+ version="1.0.0"
+ provider-name="kaqqao">
+
+ <runtime>
+ <library name="element-selector.jar">
+ <export name="*"/>
+ </library>
+ </runtime>
+
+ <extension id="kaqqao.nutch.selector.HtmlElementSelectorIndexer"
+ name="Nutch Blacklist and Whitelist Indexing Filter"
+ point="org.apache.nutch.indexer.IndexingFilter">
+ <implementation id="HtmlElementSelectorIndexer"
+ class="kaqqao.nutch.selector.HtmlElementSelectorIndexer"/>
+ </extension>
+
+ <extension id="kaqqao.nutch.selector.HtmlElementSelectorFilter"
+ name="Nutch Blacklist and Whitelist Parsing Filter"
+ point="org.apache.nutch.parse.ParseFilter">
+ <implementation id="HtmlElementSelectorFilter"
+ class="kaqqao.nutch.selector.HtmlElementSelectorFilter"/>
+ </extension>
+
+</plugin>
Index: src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorFilter.java
===================================================================
--- src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorFilter.java (revision 0)
+++ src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorFilter.java (revision 0)
@@ -0,0 +1,207 @@
+package kaqqao.nutch.plugin.selector;
+
+import org.apache.avro.util.Utf8;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.nutch.parse.HTMLMetaTags;
+import org.apache.nutch.parse.Parse;
+import org.apache.nutch.parse.ParseFilter;
+import org.apache.nutch.storage.WebPage;
+import org.apache.nutch.util.NodeWalker;
+import org.w3c.dom.DocumentFragment;
+import org.w3c.dom.Node;
+import org.w3c.dom.NodeList;
+
+import java.nio.CharBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.Charset;
+import java.nio.charset.CharsetEncoder;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+
+public class HtmlElementSelectorFilter implements ParseFilter {
+
+ private Configuration conf;
+ private String[] blacklist;
+ private String[] whitelist;
+ private String storageField;
+ private Set<String> protectedURLs;
+ private Collection<WebPage.Field> fields = new HashSet<WebPage.Field>();
+
+ @Override
+ public Parse filter(String s, WebPage webPage, Parse parse, HTMLMetaTags htmlMetaTags, DocumentFragment documentFragment) {
+ DocumentFragment rootToIndex;
+ StringBuilder strippedContent = new StringBuilder();
+ if ((this.whitelist != null) && (this.whitelist.length > 0) && !protectedURLs.contains(webPage.getBaseUrl())) {
+ rootToIndex = (DocumentFragment) documentFragment.cloneNode(false);
+ whitelisting(documentFragment, rootToIndex);
+ } else if ((this.blacklist != null) && (this.blacklist.length > 0) && !protectedURLs.contains(webPage.getBaseUrl())) {
+ rootToIndex = (DocumentFragment) documentFragment.cloneNode(true);
+ blacklisting(rootToIndex);
+ } else {
+ return parse;
+ }
+
+ getText(strippedContent, rootToIndex);
+ if (storageField == null) {
+ parse.setText(strippedContent.toString());
+ } else {
+ CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
+ try {
+ webPage.putToMetadata(new Utf8(storageField), encoder.encode(CharBuffer.wrap(strippedContent.toString())));
+ } catch (CharacterCodingException e) {
+ e.printStackTrace();
+ }
+ }
+ return parse;
+ }
+
+ private void blacklisting(Node root) {
+ boolean wasStripped = false;
+ String type = root.getNodeName().toLowerCase();
+ String id = null;
+ String className = null;
+ if (root.hasAttributes()) {
+ Node node = root.getAttributes().getNamedItem("id");
+ id = node != null ? node.getNodeValue().toLowerCase() : null;
+
+ node = root.getAttributes().getNamedItem("class");
+ className = node != null ? node.getNodeValue().toLowerCase() : null;
+ }
+
+ String typeAndId = new StringBuilder().append(type).append("#").append(id).toString();
+ String typeAndClass = new StringBuilder().append(type).append(".").append(className).toString();
+
+ boolean inList = false;
+ if ((type != null) && (Arrays.binarySearch(this.blacklist, type) >= 0))
+ inList = true;
+ else if ((type != null) && (id != null) && (Arrays.binarySearch(this.blacklist, typeAndId) >= 0))
+ inList = true;
+ else if ((type != null) && (className != null) && (Arrays.binarySearch(this.blacklist, typeAndClass) >= 0)) {
+ inList = true;
+ }
+ if (inList) {
+ root.setNodeValue("");
+
+ while (root.hasChildNodes())
+ root.removeChild(root.getFirstChild());
+ wasStripped = true;
+ }
+
+ if (!wasStripped) {
+ NodeList children = root.getChildNodes();
+ if (children != null) {
+ int len = children.getLength();
+ for (int i = 0; i < len; i++) {
+ blacklisting(children.item(i));
+ }
+ }
+ }
+ }
+
+ private void whitelisting(Node pNode, Node newNode) {
+ boolean wasStripped = false;
+ String type = pNode.getNodeName().toLowerCase();
+ String id = null;
+ String className = null;
+ if (pNode.hasAttributes()) {
+ Node node = pNode.getAttributes().getNamedItem("id");
+ id = node != null ? node.getNodeValue().toLowerCase() : null;
+
+ node = pNode.getAttributes().getNamedItem("class");
+ className = node != null ? node.getNodeValue().toLowerCase() : null;
+ }
+
+ String typeAndId = new StringBuilder().append(type).append("#").append(id).toString();
+ String typeAndClass = new StringBuilder().append(type).append(".").append(className).toString();
+
+ boolean inList = false;
+ if ((type != null) && (Arrays.binarySearch(this.whitelist, type) >= 0))
+ inList = true;
+ else if ((type != null) && (id != null) && (Arrays.binarySearch(this.whitelist, typeAndId) >= 0))
+ inList = true;
+ else if ((type != null) && (className != null) && (Arrays.binarySearch(this.whitelist, typeAndClass) >= 0)) {
+ inList = true;
+ }
+ if (inList) {
+ newNode.appendChild(pNode.cloneNode(true));
+ wasStripped = true;
+ }
+
+ if (!wasStripped) {
+ NodeList children = pNode.getChildNodes();
+ if (children != null) {
+ int len = children.getLength();
+ for (int i = 0; i < len; i++) {
+ whitelisting(children.item(i), newNode);
+ }
+ }
+ }
+ }
+
+ private void getText(StringBuilder sb, Node node) {
+ NodeWalker walker = new NodeWalker(node);
+
+ while (walker.hasNext()) {
+ Node currentNode = walker.nextNode();
+ String nodeName = currentNode.getNodeName();
+ short nodeType = currentNode.getNodeType();
+
+ if ("script".equalsIgnoreCase(nodeName)) {
+ walker.skipChildren();
+ }
+ if ("style".equalsIgnoreCase(nodeName)) {
+ walker.skipChildren();
+ }
+ if (nodeType == 8) {
+ walker.skipChildren();
+ }
+ if (nodeType == 3) {
+ String text = currentNode.getNodeValue();
+ text = text.replaceAll("\\s+", " ");
+ text = text.trim();
+ if (text.length() > 0) {
+ if (sb.length() > 0) sb.append(' ');
+ sb.append(text);
+ }
+ }
+ }
+ }
+
+ public void setConf(Configuration conf) {
+ this.conf = conf;
+
+ this.blacklist = null;
+ String elementsToExclude = getConf().get("parser.html.selector.blacklist", null);
+ if ((elementsToExclude != null) && (elementsToExclude.trim().length() > 0)) {
+ elementsToExclude = elementsToExclude.toLowerCase();
+
+ this.blacklist = elementsToExclude.split(",");
+ Arrays.sort(this.blacklist);
+ }
+
+ this.whitelist = null;
+ String elementsToInclude = getConf().get("parser.html.selector.whitelist", null);
+ if ((elementsToInclude != null) && (elementsToInclude.trim().length() > 0)) {
+ elementsToInclude = elementsToInclude.toLowerCase();
+
+ this.whitelist = elementsToInclude.split(",");
+ Arrays.sort(this.whitelist);
+ }
+
+ this.storageField = getConf().get("parser.html.selector.storage_field", null);
+
+ this.protectedURLs = new HashSet<String>(Arrays.asList(getConf().get("parser.html.selector.protected_urls", "").split(",")));
+ }
+
+ @Override
+ public Configuration getConf() {
+ return this.conf;
+ }
+
+ @Override
+ public Collection<WebPage.Field> getFields() {
+ return fields;
+ }
+}
Index: src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorIndexer.java
===================================================================
--- src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorIndexer.java (revision 0)
+++ src/plugin/element-selector/src/java/kaqqao/nutch/plugin/selector/HtmlElementSelectorIndexer.java (revision 0)
@@ -0,0 +1,54 @@
+package kaqqao.nutch.plugin.selector;
+
+import org.apache.avro.util.Utf8;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.nutch.indexer.IndexingException;
+import org.apache.nutch.indexer.IndexingFilter;
+import org.apache.nutch.indexer.NutchDocument;
+import org.apache.nutch.storage.WebPage;
+
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.Charset;
+import java.nio.charset.CharsetDecoder;
+import java.util.Collection;
+import java.util.HashSet;
+
+public class HtmlElementSelectorIndexer implements IndexingFilter {
+
+ private Configuration conf;
+ private String storageField;
+
+ @Override
+ public NutchDocument filter(NutchDocument document, String s, WebPage webPage) throws IndexingException {
+ if (storageField != null) {
+ CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
+ try {
+ String strippedContent = decoder.decode(webPage.getFromMetadata(new Utf8(storageField))).toString();
+ if (strippedContent != null) {
+ document.add(storageField, strippedContent);
+ }
+ } catch (CharacterCodingException e) {
+ e.printStackTrace();
+ }
+ }
+
+ return document;
+ }
+
+ @Override
+ public void setConf(Configuration entries) {
+ this.conf = entries;
+
+ this.storageField = getConf().get("parser.html.selector.storage_field", null);
+ }
+
+ @Override
+ public Configuration getConf() {
+ return this.conf;
+ }
+
+ @Override
+ public Collection<WebPage.Field> getFields() {
+ return new HashSet<WebPage.Field>();
+ }
+}
Index: urls/seed.txt
===================================================================
--- urls/seed.txt (revision 0)
+++ urls/seed.txt (revision 0)
@@ -0,0 +1 @@
+http://lucene.apache.org
\ No newline at end of file
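
For reference, the way the filter in the patch above decides whether an element is on the blacklist or whitelist can be sketched in isolation. This is only an illustrative sketch: SelectorMatchSketch, parseList and matches are hypothetical names, and the code mirrors the lowercase, split, sort and Arrays.binarySearch steps from setConf and blacklisting() rather than calling any Nutch API.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of the list matching used by the patch.
final class SelectorMatchSketch {

    // Mirrors setConf: lowercase the property value, split on commas,
    // and sort so that Arrays.binarySearch can be used later.
    static String[] parseList(String raw) {
        String[] items = raw.toLowerCase().split(",");
        Arrays.sort(items);
        return items;
    }

    // Mirrors the three checks in blacklisting()/whitelisting():
    // bare element type, then type#id, then type.class.
    static boolean matches(String[] sortedList, String type, String id, String className) {
        String t = type.toLowerCase();
        if (Arrays.binarySearch(sortedList, t) >= 0) {
            return true;
        }
        if (id != null && Arrays.binarySearch(sortedList, t + "#" + id.toLowerCase()) >= 0) {
            return true;
        }
        return className != null
                && Arrays.binarySearch(sortedList, t + "." + className.toLowerCase()) >= 0;
    }

    public static void main(String[] args) {
        // Value taken from parser.html.selector.blacklist in the patch.
        String[] blacklist = parseList("footer,div#mngb");

        System.out.println(matches(blacklist, "footer", null, null)); // true
        System.out.println(matches(blacklist, "DIV", "mngb", null));  // true
        System.out.println(matches(blacklist, "div", "main", null));  // false

        // Entries are not trimmed, so a space after the comma prevents a match.
        System.out.println(matches(parseList("footer, div#mngb"), "div", "mngb", null)); // false
    }
}
```

Because the comparison is an exact string match against the whole class attribute, an element with several class names (for example class="a b") would presumably not match an entry such as div.a; that follows from the code as written in the patch, not from any documented behaviour.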