This is an automated email from the ASF dual-hosted git repository.

lewismc pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git.


    from 0f942ec  Merge pull request #434 from lewismc/TIKA-3383
     new b97ae8a  Created basic structure
     new 62783b2  pom build fix
     new bf4f1fe  Copied function definitions over from translate and changed class variable types
     new a608dfa  validSourceLanguages for StartStreamTranscription added
     new e8e0a06  validSourceLanguages for StartStreamTranscription added
     new 04cff77  Check that input sourceLanguage is valid for Amazon's StartStreamTranscription
     new c60c937  Added AmazonTranscribeGuessLanguageTest
     new 7ddbff2  Fixing Merge Conflicts
     new e01d4a9  overwrote Interface
     new 58461cf  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new 5321b69  changed exception strnig throw
     new 068120f  reduced transcribe runtime by implementing HashSet
     new 205faa1  Changed variable reference to method call
     new ffbaad5  Added javadoc desription
     new 2b82324  Instantiated bucketname, clientID, and secret in contructor
     new 931b00f  amazon dependencies added to header
     new 0160246  added amazonaws dependency
     new e3e94c2  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new 65acd7e  Completed AWS audio transcribe. Lewis can you review?
     new e025d66  Merge
     new ced2ba2  Adding AmazonTranscribeGuessLaunguageTest
     new 0600a4b  Updating AmazonTranscribeTest
     new d98aebf  > removed aws from core
     new c53f73c  Package Name refactoring, more generic interface, changes in implementation
     new 04ce661  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new 0333578  uploadFileToBucket is now a private method. key -> jobName TODO add documentation for the methods
     new 79030a8  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new aa76fc7  Updated AmazonTranscribeGuessLanguageTest to mesh with AmazonTranscribe interface
     new 60d131d  Pushed changes to interface and make it AWS independent
     new ec5de38  Changed jobName from filename to auto generated UUID
     new dfd21f3  should not be creating new jobname in Upload file to bucket
     new a2c2c61  Updated AmazonTranscribeGuessLanguageTest to call getTranscriptResult
     new 17f2d10  fix for TIKA-94 contributed by phantuanminh: Rename package (Fix typo). Add simple test and test files
     new 58784b7  Resolved Merge Conflicts with other test files
     new 6921ffb  Added usage of test recourse files in AmazonTranscribeGuessLanguageTest
     new ad6ac1b  added documentation. Made the interface AWS independent some other small fixes.
     new 598edd6  no need for support of overhead conversion of mp4 to mp3
     new 6e07c57  Revert "no need for support of overhead conversion of mp4 to mp3"
     new cae79ac  few fixes based off comments
     new 3688d2d  Added de-DE, en-AU, en-GB, en-US, it-IT, ja-JP, ko-KR, pt-BR audio samples to test resources
     new 0220d57  Added tests for new audio recourse files
     new c7763b7  Added documentation and mp4 file tests to AmazonTranscribeGuessLanguageTest
     new b7221e4  remove video
     new 7805990  remove video
     new 84f7de6  startTranscribe-> transcribe
     new 818be72  test refactoring
     new 927ce49  dependency fix
     new dc2f979  white spaces and fixes
     new fb1a86f  imports fix
     new 600d972  added some comments
     new b5be3be  added comments
     new 09ed087  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new 9d2c986  Changed startTranscribe() to transcribe()
     new a568b13  comments
     new e8ff342  String -> InputStream
     new b2bcfd3  String -> InputStream TODO:Testing -> AWS & new changes testing.
     new aba21b8  Merge remote-tracking branch 'origin/TIKA-94' into TIKA-94
     new f27f864  Add documentation for tests, modify and merge test files
     new f3e284d  Merge branch 'TIKA-94' of https://github.com/rohan2810/tika into TIKA-94
     new f633e65  [TIKA-94] Speech-to-text transcription
     new 2d0f9e2  Merge pull request #406 from rohan2810/TIKA-94

The 5223 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 pom.xml                                            |   1 +
 .../org/apache/tika/transcribe/Transcriber.java    |  60 +++
 tika-transcribe/pom.xml                            | 150 ++++++
 .../apache/tika/transcribe/AmazonTranscribe.java   | 264 +++++++++++
 .../org.apache.tika.language.translate.Translator  |   2 +-
 .../transcribe.amazon.properties                   |   5 +-
 .../tika/transcribe/AmazonTranscribeTest.java      | 527 +++++++++++++++++++++
 .../src/test/resources/ShortAudioSampleFrench.mp3  | Bin 0 -> 25861 bytes
 .../test/resources/de-DE_(We_Are_At_School_x2).mp3 | Bin 0 -> 38547 bytes
 .../resources/en-AU_(A_Little_Bottle_Of_Water).mp3 | Bin 0 -> 33365 bytes
 .../resources/en-GB_(A_Little_Bottle_Of_Water).mp3 | Bin 0 -> 35872 bytes
 .../resources/en-US_(A_Little_Bottle_Of_Water).mp3 | Bin 0 -> 29603 bytes
 tika-transcribe/src/test/resources/en-US_(Hi).mp4  | Bin 0 -> 21739 bytes
 .../resources/it-IT_(We_Are_Having_Class_x2).mp3   | Bin 0 -> 42219 bytes
 .../test/resources/ja-JP_(We_Are_At_School).mp3    | Bin 0 -> 21699 bytes
 .../src/test/resources/ko-KR_(Annyeonghaseyo).mp4  | Bin 0 -> 144151 bytes
 .../resources/ko-KR_(We_Are_Having_Class_x2).mp3   | Bin 0 -> 66843 bytes
 .../test/resources/pt-BR_(We_Are_At_School).mp3    | Bin 0 -> 29043 bytes
 18 files changed, 1006 insertions(+), 3 deletions(-)
 create mode 100644 tika-core/src/main/java/org/apache/tika/transcribe/Transcriber.java
 create mode 100644 tika-transcribe/pom.xml
 create mode 100644 tika-transcribe/src/main/java/org/apache/tika/transcribe/AmazonTranscribe.java
 copy tika-core/src/main/resources/META-INF/services/org.apache.tika.detect.Detector => tika-transcribe/src/main/resources/META-INF.services/org.apache.tika.language.translate.Translator (93%)
 copy tika-parsers/tika-parsers-advanced/tika-parser-nlp-module/src/test/resources/org/apache/tika/parser/ner/regex/ner-regex.txt => tika-transcribe/src/main/resources/org.apache.tika.transcribe/transcribe.amazon.properties (88%)
 create mode 100644 tika-transcribe/src/test/java/org/apache/tika/transcribe/AmazonTranscribeTest.java
 create mode 100644 tika-transcribe/src/test/resources/ShortAudioSampleFrench.mp3
 create mode 100644 tika-transcribe/src/test/resources/de-DE_(We_Are_At_School_x2).mp3
 create mode 100644 tika-transcribe/src/test/resources/en-AU_(A_Little_Bottle_Of_Water).mp3
 create mode 100644 tika-transcribe/src/test/resources/en-GB_(A_Little_Bottle_Of_Water).mp3
 create mode 100644 tika-transcribe/src/test/resources/en-US_(A_Little_Bottle_Of_Water).mp3
 create mode 100644 tika-transcribe/src/test/resources/en-US_(Hi).mp4
 create mode 100644 tika-transcribe/src/test/resources/it-IT_(We_Are_Having_Class_x2).mp3
 create mode 100644 tika-transcribe/src/test/resources/ja-JP_(We_Are_At_School).mp3
 create mode 100644 tika-transcribe/src/test/resources/ko-KR_(Annyeonghaseyo).mp4
 create mode 100644 tika-transcribe/src/test/resources/ko-KR_(We_Are_Having_Class_x2).mp3
 create mode 100644 tika-transcribe/src/test/resources/pt-BR_(We_Are_At_School).mp3