Re: HA support for Spark
Spark Streaming essentially does this by saving the DAG of DStreams, which can deterministically regenerate the DAG of RDDs upon recovery from failure. Along with that, the progress information (which batches have finished, which batches are queued, etc.) is also saved, so that upon recovery the system can restart from where it was before the failure. This was conceptually easy to do because the RDDs are generated deterministically in every batch. Extending this to a very general Spark program with arbitrary RDD computations is definitely conceptually possible but not that easy to do.

On Wed, Dec 10, 2014 at 7:34 PM, Jun Feng Liu liuj...@cn.ibm.com wrote: Right, perhaps we also need to preserve some DAG information? I am wondering if there is any work around this.

Sandy Ryza sandy.r...@cloudera.com 2014-12-11 01:34 To Jun Feng Liu/China/IBM@IBMCN, cc Reynold Xin r...@databricks.com, dev@spark.apache.org Subject Re: HA support for Spark

I think that if we were able to maintain the full set of created RDDs as well as some scheduler and block manager state, it would be enough for most apps to recover.

On Wed, Dec 10, 2014 at 5:30 AM, Jun Feng Liu liuj...@cn.ibm.com wrote: Well, it should not be mission impossible, considering how many HA solutions exist today. I would be interested to know if there is any specific difficulty. Best Regards, Jun Feng Liu, IBM China Systems Technology Laboratory in Beijing. Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park, No.8 Rd. Dong Bei Wang West, Dist. Haidian, Beijing 100193, China

Reynold Xin r...@databricks.com 2014/12/10 16:30 To Jun Feng Liu/China/IBM@IBMCN, cc dev@spark.apache.org Subject Re: HA support for Spark

This would be plausible for specific purposes such as Spark Streaming or Spark SQL, but I don't think it is doable for a general Spark driver, since it is just a normal JVM process with arbitrary program state.

On Wed, Dec 10, 2014 at 12:25 AM, Jun Feng Liu liuj...@cn.ibm.com wrote: Do we have any high-availability support at the Spark driver level? For example, can the Spark driver move to another node and continue execution when a failure happens? I can see that an RDD checkpoint can help serialize the state of an RDD, and I can imagine loading the checkpoint from another node when an error happens, but it seems we would lose track of all task status and even the executor information maintained in the Spark context. I am not sure if there is any existing mechanism I can leverage to do that. Thanks for any suggestions. Best Regards, Jun Feng Liu, IBM China Systems Technology Laboratory in Beijing. Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park, No.8 Rd. Dong Bei Wang West, Dist. Haidian, Beijing 100193, China
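For readers following along, here is a minimal sketch of the driver-recovery pattern described above, using the StreamingContext checkpoint API; the checkpoint path, host, port, and batch interval are placeholders, not values from this thread:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RecoverableApp {
  // Placeholder checkpoint location; it should live on fault-tolerant storage
  // (e.g. HDFS) so a driver restarted on another node can read it.
  val checkpointDir = "hdfs:///tmp/streaming-checkpoint"

  // Builds the DStream DAG from scratch; only invoked when no checkpoint exists.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("RecoverableApp")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)
    ssc.socketTextStream("localhost", 9999).count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On a clean start this calls createContext(); after a driver failure it
    // rebuilds the DStream DAG and batch progress from the saved checkpoint.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}

As the thread notes, this recovers the streaming DAG and batch progress, not arbitrary driver-side program state.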
Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
+1 (non-binding)

Built and tested on Windows 7:

cd apache-spark
git fetch
git checkout v1.2.0-rc2
sbt assembly
[warn] ... [warn]
[success] Total time: 720 s, completed Dec 11, 2014 8:57:36 AM

dir assembly\target\scala-2.10\spark-assembly-1.2.0-hadoop1.0.4.jar
110,361,054 spark-assembly-1.2.0-hadoop1.0.4.jar

Ran some of my 1.2 code successfully. Reviewed some docs; they look good. spark-shell.cmd works as expected.

Env details:
sbtconfig.txt: -Xmx1024M -XX:MaxPermSize=256m -XX:ReservedCodeCacheSize=128m
sbt --version
sbt launcher version 0.13.1

--
Madhu
https://www.linkedin.com/in/msiddalingaiah
Re: HA support for Spark
Interesting. Are you saying the StreamingContext checkpoint can regenerate the DAG? Best Regards, Jun Feng Liu, IBM China Systems Technology Laboratory in Beijing. Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park, No.8 Rd. Dong Bei Wang West, Dist. Haidian, Beijing 100193, China

Tathagata Das tathagata.das1...@gmail.com 2014/12/11 20:20 To Jun Feng Liu/China/IBM@IBMCN, cc Sandy Ryza sandy.r...@cloudera.com, dev@spark.apache.org, Reynold Xin r...@databricks.com Subject Re: HA support for Spark

Spark Streaming essentially does this by saving the DAG of DStreams, which can deterministically regenerate the DAG of RDDs upon recovery from failure. Along with that, the progress information (which batches have finished, which batches are queued, etc.) is also saved, so that upon recovery the system can restart from where it was before the failure. This was conceptually easy to do because the RDDs are generated deterministically in every batch. Extending this to a very general Spark program with arbitrary RDD computations is definitely conceptually possible but not that easy to do.
Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
Signatures and checksums are OK. License and notice still look fine. The plain-vanilla source release compiles with Maven 3.2.1 and passes tests on OS X 10.10 + Java 8.

On Wed, Dec 10, 2014 at 9:08 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1055/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/

Please vote on releasing this package as Apache Spark 1.2.0! The vote is open until Saturday, December 13, at 21:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening relatively late into the QA period, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.1.X, minor regressions, or bugs related to new features will not block this release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been changed to netty -- old behavior can be restored by switching to nio.
2. The default value of spark.shuffle.manager has been changed to sort -- old behavior can be restored by setting spark.shuffle.manager to hash.

== How does this differ from RC1? ==
This has fixes for a handful of issues identified; some of the notable fixes are:
[Core] SPARK-4498: Standalone Master can fail to recognize completed/failed applications
[SQL] SPARK-4552: Query for empty parquet table in spark sql hive get IllegalArgumentException
SPARK-4753: Parquet2 does not prune based on OR filters on partition columns
SPARK-4761: With JDBC server, set Kryo as default serializer and disable reference tracking
SPARK-4785: When called with arguments referring column fields, PMOD throws NPE

- Patrick
Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
+1 Tested on OS X.
Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
+1 (non-binding). Tested on Ubuntu against YARN.
Evaluation Metrics for Spark's MLlib
Hi, I would like to contribute to Spark's machine learning library by adding evaluation metrics that can be used to gauge the accuracy of a model given a certain feature set. In particular, I would like to contribute k-fold cross-validation and the F-beta metric, among others, on top of the current MLlib framework. Please advise on the steps I could take to contribute in this manner. Regards, kidynamit
Where are the docs for the SparkSQL DataTypes?
Michael and other Spark SQL junkies,

As I read through the Spark API docs, in particular those for the org.apache.spark.sql package, I can't seem to find details about the Scala classes representing the various SparkSQL DataTypes, for instance DecimalType. I find DataType classes in org.apache.spark.sql.api.java, but they don't seem to match the similarly named Scala classes. For instance, DecimalType is documented as having a nullary constructor, but if I try to construct an instance of org.apache.spark.sql.DecimalType without any parameters, the compiler complains about the lack of a precisionInfo field, which I have discovered can be passed in as None. Where is all this stuff documented?

Alex
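As a point of reference, here is a small sketch of what the last observation above describes on the Scala side, assuming the Spark 1.2-era catalyst types; the package path is an assumption and moved in later releases:

// Assumption: in Spark 1.2 these live under org.apache.spark.sql.catalyst.types
// and are re-exported from org.apache.spark.sql; adjust imports to your release.
import org.apache.spark.sql.catalyst.types.{DecimalType, PrecisionInfo}

// The Scala DecimalType takes an Option[PrecisionInfo], so passing None gives an
// unlimited-precision decimal (the counterpart of the "nullary" Java constructor).
val unlimited = DecimalType(None)

// A fixed precision and scale, e.g. DECIMAL(10, 2).
val fixed = DecimalType(Some(PrecisionInfo(10, 2)))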
Re: Evaluation Metrics for Spark's MLlib
Hi, I'd recommend starting by checking out the existing helper functionality for these tasks.

There are helper methods to do k-fold cross-validation in MLUtils:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala

The experimental spark.ml API in the Spark 1.2 release (in branch-1.2 and master) has a CrossValidator class which does this more automatically:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala

There are also a few evaluation metrics implemented:
https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation

There definitely could be more metrics and/or better APIs to make it easier to evaluate models on RDDs. If you spot such cases, I'd recommend opening up JIRAs for the new features or improvements to get some feedback before sending PRs:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

Hope this helps; looking forward to the contributions!

Joseph
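To make the pointers above concrete, here is a minimal sketch of k-fold cross-validation with MLUtils.kFold on a binary-classification RDD; the fold count, seed, and choice of LogisticRegressionWithLBFGS are illustrative assumptions, not recommendations from this thread:

import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD

// `data` is assumed to be an RDD[LabeledPoint] with 0/1 labels.
def crossValidatedAUC(data: RDD[LabeledPoint], numFolds: Int = 5): Double = {
  // kFold returns one (training, validation) pair of RDDs per fold.
  val folds = MLUtils.kFold(data, numFolds, 42)
  val aucs = folds.map { case (training, validation) =>
    val model = new LogisticRegressionWithLBFGS().run(training)
    model.clearThreshold()  // return raw scores instead of 0/1 predictions
    val scoreAndLabels = validation.map(p => (model.predict(p.features), p.label))
    new BinaryClassificationMetrics(scoreAndLabels).areaUnderROC()
  }
  aucs.sum / aucs.length
}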
Re: Tachyon in Spark
I'm interested in understanding this as well. One of the main ways Tachyon is supposed to realize performance gains without sacrificing durability is by storing the lineage of data rather than full copies of it (similar to Spark). But if Spark isn't sending lineage information into Tachyon, then I'm not sure how this isn't a durability concern.

On Wed, Dec 10, 2014 at 5:47 AM, Jun Feng Liu liuj...@cn.ibm.com wrote: Does Spark today really leverage Tachyon lineage to process data? It seems like the application should call the createDependency function in TachyonFS to create a new lineage node, but I did not find any place that calls it in the Spark code. Did I miss anything? Best Regards, Jun Feng Liu, IBM China Systems Technology Laboratory in Beijing. Phone: 86-10-82452683 E-mail: liuj...@cn.ibm.com BLD 28, ZGC Software Park, No.8 Rd. Dong Bei Wang West, Dist. Haidian, Beijing 100193, China
Re: Tachyon in Spark
I don't think the lineage thing is even turned on in Tachyon - it was mostly a research prototype, so I don't think it'd make sense for us to use that.
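For context on how Spark actually used Tachyon at this point: it served as an off-heap block store for cached RDDs, while the lineage needed to recompute lost blocks stayed in Spark's own RDD graph. A minimal sketch, with the Tachyon URL and input path as placeholders, and the config key assumed to be the Spark 1.x-era spark.tachyonStore.url:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Placeholder Tachyon master address; adjust for a real deployment.
val conf = new SparkConf()
  .setAppName("TachyonOffHeapSketch")
  .set("spark.tachyonStore.url", "tachyon://localhost:19998")
val sc = new SparkContext(conf)

// OFF_HEAP puts the cached blocks in Tachyon; if Tachyon loses them, Spark
// recomputes from its own lineage rather than relying on Tachyon's lineage API.
val lengths = sc.textFile("hdfs:///data/input").map(_.length)
lengths.persist(StorageLevel.OFF_HEAP)
lengths.count()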
running the Terasort example
Hi all, I just joined the list, so I don't have a message history that would allow me to reply to this post: http://apache-spark-developers-list.1001551.n3.nabble.com/Terasort-example-td9284.html I am interested in running the terasort example. I cloned the repo https://github.com/ehiggs/spark and did a checkout of the terasort branch. In the above referenced post Ewan gives the example: # Generate 1M 100 byte records: ./bin/run-example terasort.TeraGen 100M ~/data/terasort_in I don't see a "run-example" in that repo. I'm sure I am missing something basic, or less likely, maybe some changes weren't pushed? Thanks for any help, Tim
RE: Where are the docs for the SparkSQL DataTypes?
Part of it can be found at: https://github.com/apache/spark/pull/3429/files#diff-f88c3e731fcb17b1323b778807c35b38R34 Sorry, it is a yet-to-be-reviewed PR, but it should still be informative. Cheng Hao
Is there any document to explain how to build the hive jars for spark?
Hi all, we found some bugs in Hive 0.12, but we could not wait for the Hive community to fix them. We want to fix these bugs in our lab and build a new release that can be recognized by Spark. As we know, Spark depends on a special release of Hive, like:

<dependency>
  <groupId>org.spark-project.hive</groupId>
  <artifactId>hive-metastore</artifactId>
  <version>${hive.version}</version>
</dependency>

The difference between org.spark-project.hive and org.apache.hive was described by Patrick:

There are two differences: 1. We publish hive with a shaded protobuf dependency to avoid conflicts with some Hadoop versions. 2. We publish a proper hive-exec jar that only includes hive packages. The upstream version of hive-exec bundles a bunch of other random dependencies in it, which makes it really hard for third-party projects to use it.

Is there any document to guide us on how to build the Hive jars for Spark? Any help would be greatly appreciated.
Re: Where are the docs for the SparkSQL DataTypes?
Thanks. This is useful. Alex