Tim Starling has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/235669

Change subject: Use /document as the path instead of /depurate
......................................................................

Use /document as the path instead of /depurate

The path is conceived as a format specifier, for forwards compatibility with
other output formats.

Also add a README.md file for documentation.

Change-Id: I31f060d0025cf7f1a0393d7940d0a7f31e9ccb3a
---
A README.md
M src/main/java/org/wikimedia/html5depurate/DepurateDaemon.java
M src/main/java/org/wikimedia/html5depurate/DepurateHandler.java
3 files changed, 60 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/services/html5depurate refs/changes/69/235669/1

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..cb6ac6b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,53 @@
+This is an HTTP frontend for the validator.nu HTML 5 parser. It parses some 
+input text and returns the reserialized HTML.
+
+## Compile and test
+
+Ubuntu build/test dependencies:
+* openjdk-7-jdk
+* maven2
+* jsvc
+
+Compile with `mvn compile`. Then `mvn dependency:build-classpath` will display
+a classpath suitable for testing. Then the daemon can be started with something
+like:
+
+```
+/usr/bin/jsvc \
+       -cp "$classpath":target/classes \
+       -pidfile /tmp/html5depurate.pid \
+       -errfile /tmp/html5depurate.err \
+       -outfile /tmp/html5depurate.out \
+       -procname html5depurate \
+       org.wikimedia.html5depurate.DepurateDaemon
+```
+
+The default log format is pretty bad but can be configured by the usual means,
+with -Djava.util.logging.config.file=/path/to/logging.properties
+
+Then to test:
+
+```
+curl http://localhost:4339/document -F text=foo
+```
+
+This will return an HTML document which is a reserialized version of "foo".
+
+## To do
+
+* Debian packaging
+  - A SysV init script wrapping jsvc should be fairly simple.
+  - Very strong security guarantees are possible by using a security.policy
+    file.
+  - Most Maven dependencies are packaged already, with the exception of the
+    validator.nu parser itself, which needs to be bundled.
+
+* Collect warnings/errors and provide a JSON serialized return format
+  exposed at /info.
+
+* Help out MW a bit by extracting the contents of the body tag. This could be
+  provided at /body.
+
+* A servlet version, if someone needs that. An early version depended on a
+  servlet container, but I abandoned that approach in favour of the robustness
+  and management simplicity of a standalone daemon.
diff --git a/src/main/java/org/wikimedia/html5depurate/DepurateDaemon.java b/src/main/java/org/wikimedia/html5depurate/DepurateDaemon.java
index a171c7b..a48d486 100644
--- a/src/main/java/org/wikimedia/html5depurate/DepurateDaemon.java
+++ b/src/main/java/org/wikimedia/html5depurate/DepurateDaemon.java
@@ -96,7 +96,7 @@
                                new NetworkListener("depurate", config.host, config.port));
 
                ServerConfiguration serverConf = m_server.getServerConfiguration();
-               serverConf.addHttpHandler(new DepurateHandler(config), "/depurate");
+               serverConf.addHttpHandler(new DepurateHandler(config), "/document", "/body");
                serverConf.setDefaultErrorPageGenerator(new DepurateErrorPageGenerator());
                serverConf.setName("depurate");
                m_server.start();
diff --git a/src/main/java/org/wikimedia/html5depurate/DepurateHandler.java b/src/main/java/org/wikimedia/html5depurate/DepurateHandler.java
index 40ca68c..69fbed0 100644
--- a/src/main/java/org/wikimedia/html5depurate/DepurateHandler.java
+++ b/src/main/java/org/wikimedia/html5depurate/DepurateHandler.java
@@ -24,6 +24,7 @@
        final private Config m_config;
        Logger m_logger = Logger.getLogger(this.getClass().getName());
 
+
        DepurateHandler(Config config) {
                super("depurate");
                m_config = config;
@@ -35,6 +36,11 @@
        {
                m_logger.finer("Request received");
 
+               if (!request.getContextPath().equals("/document")) {
+                       sendError(response, 404, "Only /document is supported");
+                       return;
+               }
+
                response.suspend();
                request.setCharacterEncoding("UTF-8");
                final MultipartBuffer buf = new MultipartBuffer(m_config.maxPostSize);

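The commit message describes the request path as a format specifier, and the handler hunk above rejects every path except /document even though /body is also registered. The following is a minimal, standalone sketch of that idea and is not code from this change: the class `PathFormatSketch`, the enum `Format`, and the method `statusFor` are hypothetical names, and the sketch deliberately avoids the Grizzly API so that it can run on its own.

```java
import java.util.Optional;

/** Hypothetical sketch: the request path selects an output format. */
public class PathFormatSketch {

    /** Paths registered by the change; only /document is served so far. */
    enum Format {
        DOCUMENT("/document", true),
        BODY("/body", false); // registered, but still on the README to-do list

        final String path;
        final boolean supported;

        Format(String path, boolean supported) {
            this.path = path;
            this.supported = supported;
        }

        /** Looks up a format by its exact path. */
        static Optional<Format> fromPath(String path) {
            for (Format f : values()) {
                if (f.path.equals(path)) {
                    return Optional.of(f);
                }
            }
            return Optional.empty();
        }
    }

    /** Returns 200 for a supported path and 404 for anything else. */
    static int statusFor(String contextPath) {
        return Format.fromPath(contextPath)
                .filter(f -> f.supported)
                .map(f -> 200)
                .orElse(404);
    }

    public static void main(String[] args) {
        for (String p : new String[] {"/document", "/body", "/depurate"}) {
            System.out.println(p + " -> " + statusFor(p));
        }
    }
}
```

Under these assumptions, supporting a further output format later would mean flipping one `supported` flag or adding one enum constant, without changing the dispatch logic.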
-- 
To view, visit https://gerrit.wikimedia.org/r/235669
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I31f060d0025cf7f1a0393d7940d0a7f31e9ccb3a
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/services/html5depurate
Gerrit-Branch: master
Gerrit-Owner: Tim Starling <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
