This is an automated email from the ASF dual-hosted git repository.

tballison pushed a commit to branch docs/pipes-updates
in repository https://gitbox.apache.org/repos/asf/tika.git

commit 9cf2de2c3a170cdbbbdaae2b6480b56d81290c94
Author: tallison <[email protected]>
AuthorDate: Mon May 11 11:43:39 2026 -0400

    add file system docs
---
 docs/modules/ROOT/examples/pipes-fs-pipeline.json  |   2 +-
 docs/modules/ROOT/nav.adoc                         |   2 +
 docs/modules/ROOT/pages/pipes/getting-started.adoc |   4 +-
 docs/modules/ROOT/pages/pipes/plugins/index.adoc   | 133 +++++++++++++++++++++
 4 files changed, 139 insertions(+), 2 deletions(-)

diff --git a/docs/modules/ROOT/examples/pipes-fs-pipeline.json b/docs/modules/ROOT/examples/pipes-fs-pipeline.json
index 5a7538b141..4b71666add 120000
--- a/docs/modules/ROOT/examples/pipes-fs-pipeline.json
+++ b/docs/modules/ROOT/examples/pipes-fs-pipeline.json
@@ -1 +1 @@
-../../../../tika-pipes/tika-pipes-plugins/tika-pipes-file-system/src/test/resources/config-examples/file-system-pipeline.json
\ No newline at end of file
+../../../../tika-pipes/tika-pipes-integration-tests/src/test/resources/configs/tika-config-basic.json
\ No newline at end of file
diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc
index 979555022a..ef16b190dd 100644
--- a/docs/modules/ROOT/nav.adoc
+++ b/docs/modules/ROOT/nav.adoc
@@ -31,6 +31,8 @@
 ** xref:pipes/unpack-config.adoc[Extracting Embedded Bytes]
 ** xref:pipes/timeouts.adoc[Timeouts]
 ** xref:pipes/cpu-sizing.adoc[Forked-JVM CPU Sizing]
+** xref:pipes/plugins/index.adoc[Plugins]
+*** xref:pipes/plugins/filesystem.adoc[File System]
 * xref:configuration/index.adoc[Configuration]
 ** xref:configuration/parsers/pdf-parser.adoc[PDF Parser]
 ** xref:configuration/parsers/tesseract-ocr-parser.adoc[Tesseract OCR]
diff --git a/docs/modules/ROOT/pages/pipes/getting-started.adoc b/docs/modules/ROOT/pages/pipes/getting-started.adoc
index 6ee6c45148..e52e02f1ac 100644
--- a/docs/modules/ROOT/pages/pipes/getting-started.adoc
+++ b/docs/modules/ROOT/pages/pipes/getting-started.adoc
@@ -64,7 +64,9 @@ pipeline:
 ----
 include::example$pipes-fs-pipeline.json[]
 ----
-icon:github[] https://github.com/apache/tika/blob/main/tika-pipes/tika-pipes-plugins/tika-pipes-file-system/src/test/resources/config-examples/file-system-pipeline.json[View source on GitHub]
+icon:github[] https://github.com/apache/tika/blob/main/tika-pipes/tika-pipes-integration-tests/src/test/resources/configs/tika-config-basic.json[View source on GitHub]
+
+NOTE: The values shown like `FETCHER_BASE_PATH`, `EMITTER_BASE_PATH`, and `PLUGINS_PATHS` are placeholders the integration tests substitute at runtime. Replace them with real paths in your own config.
 
 Run it with:
 
diff --git a/docs/modules/ROOT/pages/pipes/plugins/index.adoc b/docs/modules/ROOT/pages/pipes/plugins/index.adoc
new file mode 100644
index 0000000000..8542fa2034
--- /dev/null
+++ b/docs/modules/ROOT/pages/pipes/plugins/index.adoc
@@ -0,0 +1,133 @@
+//
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+
+= Pipes Plugins
+
+Tika Pipes is extensible through plugins. Each plugin lives in its own Maven module and can implement one or more of the four pipes extension points:
+
+* **Fetcher** — retrieves document bytes from a source.
+* **Emitter** — writes parsed results to a destination.
+* **Iterator** (`PipesIterator`) — enumerates documents to process as `FetchEmitTuple` records.
+* **Reporter** (`PipesReporter`) — records per-document processing status.
+
+Many plugins implement more than one (e.g., the S3 plugin provides fetcher, emitter, and iterator). The pages below document each plugin once, with one section per implemented interface.
+
+== Plugin / Interface Matrix
+
+[cols="2,1,1,1,1"]
+|===
+|Plugin |Fetcher |Emitter |Iterator |Reporter
+
+|xref:pipes/plugins/filesystem.adoc[File System]
+|✓
+|✓
+|✓
+|✓
+
+|xref:pipes/plugins/s3.adoc[Amazon S3]
+|✓
+|✓
+|✓
+|—
+
+|xref:pipes/plugins/gcs.adoc[Google Cloud Storage]
+|✓
+|✓
+|✓
+|—
+
+|xref:pipes/plugins/azblob.adoc[Azure Blob Storage]
+|✓
+|✓
+|✓
+|—
+
+|xref:pipes/plugins/opensearch.adoc[OpenSearch]
+|—
+|✓
+|—
+|✓
+
+|xref:pipes/plugins/elasticsearch.adoc[Elasticsearch]
+|—
+|✓
+|—
+|✓
+
+|xref:pipes/plugins/solr.adoc[Solr]
+|—
+|✓
+|✓
+|—
+
+|xref:pipes/plugins/jdbc.adoc[JDBC]
+|—
+|✓
+|✓
+|✓
+
+|xref:pipes/plugins/kafka.adoc[Kafka]
+|—
+|✓
+|✓
+|—
+
+|xref:pipes/plugins/http.adoc[HTTP]
+|✓
+|—
+|—
+|—
+
+|xref:pipes/plugins/google-drive.adoc[Google Drive]
+|✓
+|—
+|—
+|—
+
+|xref:pipes/plugins/microsoft-graph.adoc[Microsoft Graph]
+|✓
+|—
+|—
+|—
+
+|xref:pipes/plugins/atlassian-jwt.adoc[Atlassian JWT]
+|✓
+|—
+|—
+|—
+
+|xref:pipes/plugins/csv.adoc[CSV]
+|—
+|—
+|✓
+|—
+
+|xref:pipes/plugins/json.adoc[JSON]
+|—
+|—
+|✓
+|—
+|===
+
+== Interface Overviews
+
+For descriptions of the interfaces themselves — their contracts, the shared concepts (`FetchKey`, `FetchEmitTuple`, `baseConfig`, etc.), and how they fit into a pipeline — see:
+
+* xref:pipes/fetchers.adoc[Fetchers]
+* xref:pipes/emitters.adoc[Emitters]
+* xref:pipes/iterators.adoc[Pipes Iterators]
+* xref:pipes/reporters.adoc[Pipes Reporters]
